Course objectives



In order to reach the more interesting and useful ideas, we shall adopt a fairly 
brutal approach to some early material. Lengthy proofs will sometimes be left 
out, though full versions will be made available. By the end of the course, you 
should have a good understanding of normed vector spaces, Hilbert and Banach 
spaces, fixed point theorems and examples of function spaces. These ideas will be 
illustrated with applications to differential equations. 

Books 

You do not need to buy a book for this course, but the following may be useful for
background reading. If you do buy something, the starred books are recommended.

[1] Functional Analysis, W. Rudin, McGraw-Hill (1973). This book is thorough, 
sophisticated and demanding. 

[2] Functional Analysis, F. Riesz and B. Sz.-Nagy, Dover (1990). This is a classic 
text, also much more sophisticated than the course. 

[3]* Foundations of Modern Analysis, A. Friedman, Dover (1982). Cheap and 
cheerful, includes a useful few sections on background. 

[4]* Essential Results of Functional Analysis, R.J. Zimmer, University of Chicago
Press (1990). Lots of good problems and a useful chapter on background.

[5]* Functional Analysis in Modern Applied Mathematics, R.F. Curtain and A.J.
Pritchard, Academic Press (1977). This book is closest to the course.

[6]* Linear Analysis, B. Bollobás, Cambridge University Press (1995). This book
is excellent but makes heavy demands on the reader.
CHAPTER 1



Normed Linear Spaces 

A linear space is simply an abstract version of the familiar vector spaces $\mathbb{R}$, $\mathbb{R}^2$,
$\mathbb{R}^3$ and so on. Recall that vector spaces have certain algebraic properties: vectors
may be added and multiplied by scalars, and vector spaces have bases and subspaces.
Linear maps between vector spaces may be described in terms of matrices. Using the
Euclidean norm or distance, vector spaces have other analytic properties (though
you may not have called them that): for example, certain functions from $\mathbb{R}$ to $\mathbb{R}$
are continuous, differentiable, Riemann integrable and so on.

We need to make three steps of generalization. 
Bases: The first is familiar: instead of, for example, $\mathbb{R}^3$, we shall sometimes want
to talk about an abstract three-dimensional vector space $V$ over the field $\mathbb{R}$. This
distinction amounts to having a specific basis $\{e_1, e_2, e_3\}$ in mind, in which case
every element of $V$ corresponds to a triple $(a, b, c) = ae_1 + be_2 + ce_3$ of reals, or
choosing not to think of a specific basis, in which case the elements of $V$ are just
abstract vectors $v$. In the abstract language we talk about linear maps or operators
between vector spaces; after choosing a basis, linear maps become matrices, though
in an infinite-dimensional setting it is rarely useful to think in terms of matrices.

Ground fields: The second is fairly trivial and is also familiar: the ground field
can be any field. We shall only be interested in $\mathbb{R}$ (real vector spaces) and $\mathbb{C}$
(complex vector spaces). Notice that $\mathbb{C}$ is itself a two-dimensional vector space
over $\mathbb{R}$ with additional structure (multiplication). Choosing the basis $\{1, i\}$ for $\mathbb{C}$
over $\mathbb{R}$ we may identify $z \in \mathbb{C}$ with the vector $(\Re(z), \Im(z)) \in \mathbb{R}^2$.

Dimension: In linear algebra courses, you deal with finite-dimensional vector
spaces. Such spaces (over a fixed ground field) are determined up to isomorphism
by their dimension. We shall be mainly looking at linear spaces that are
not finite-dimensional, and several new features appear. All of these features may
be summed up in one line: the algebra of infinite-dimensional linear spaces is
intimately connected to the topology. For example, linear maps between $\mathbb{R}^2$ and $\mathbb{R}^2$
are automatically continuous. For infinite-dimensional spaces, some linear maps
are not continuous.

1. Linear (vector) spaces 

Definition 1.1. A linear space over a field $k$ is a set $V$ equipped with maps
$\oplus : V \times V \to V$ and $\cdot : k \times V \to V$ with the properties

(1) $x \oplus y = y \oplus x$ for all $x, y \in V$ (addition is commutative);

(2) $(x \oplus y) \oplus z = x \oplus (y \oplus z)$ for all $x, y, z \in V$ (addition is associative);

(3) there is an element $0 \in V$ such that $x \oplus 0 = 0 \oplus x = x$ for all $x \in V$ (a zero
element);

(4) for each $x \in V$ there is a unique element $-x \in V$ with $x \oplus (-x) = 0$ (additive
inverses);

(notice that $(V, \oplus)$ therefore forms an abelian group)

(5) $\alpha \cdot (\beta \cdot x) = (\alpha\beta) \cdot x$ for all $\alpha, \beta \in k$ and $x \in V$;

(6) $(\alpha + \beta) \cdot x = \alpha \cdot x \oplus \beta \cdot x$ for all $\alpha, \beta \in k$ and $x \in V$ (scalar multiplication
distributes over scalar addition);

(7) $\alpha \cdot (x \oplus y) = \alpha \cdot x \oplus \alpha \cdot y$ for all $\alpha \in k$ and $x, y \in V$ (scalar multiplication
distributes over vector addition);

(8) $1 \cdot x = x$ for all $x \in V$, where $1$ is the multiplicative identity in the field $k$.

Example 1.2. [1] Let $V = \mathbb{R}^n = \{x = (x_1, \dots, x_n) \mid x_i \in \mathbb{R}\}$ with the usual
vector addition and scalar multiplication.

[2] Let $V$ be the set of all polynomials with coefficients in $\mathbb{R}$ of degree $\le n$ with
usual addition of polynomials and scalar multiplication.

[3] Let $V$ be the set $M_{(m,n)}(\mathbb{C})$ of complex-valued $m \times n$ matrices, with usual
addition of matrices and scalar multiplication.

[4] Let $\ell_\infty$ denote the set of infinite sequences $(x_1, x_2, x_3, \dots)$ that are bounded:
$\sup\{|x_n|\} < \infty$. Then $\ell_\infty$ is a linear space, since $\sup\{|x_n + y_n|\} \le \sup\{|x_n|\} +
\sup\{|y_n|\} < \infty$ and $\sup\{|\alpha x_n|\} = |\alpha| \sup\{|x_n|\}$.

[5] Let $C(S)$ be the set of continuous functions $f : S \to \mathbb{R}$ with addition $(f \oplus g)(x) =
f(x) + g(x)$ and scalar multiplication $(\alpha \cdot f)(x) = \alpha f(x)$. Here $S$ is, for example,
any subset of $\mathbb{R}$. The dimension of $C(S)$ is infinite if $S$ is an infinite set, and is
exactly $|S|$ if not (see the note at the end of this example).

[6] Let $V$ be the set of Riemann-integrable functions $f : (0,1) \to \mathbb{R}$ which are
square-integrable: that is, with the property that $\int_0^1 |f(x)|^2\,dx < \infty$. We need to
check that this is a linear space. Closure under scalar multiplication is clear: if
$\int_0^1 |f(x)|^2\,dx < \infty$ and $\alpha \in \mathbb{R}$ then $\int_0^1 |\alpha f(x)|^2\,dx = |\alpha|^2 \int_0^1 |f(x)|^2\,dx < \infty$. For
vector addition we need the Cauchy-Schwarz inequality:

$$\int_0^1 |f(x) + g(x)|^2\,dx \le \int_0^1 \left(|f(x)|^2 + 2|f(x)||g(x)| + |g(x)|^2\right) dx$$

$$\le \int_0^1 |f(x)|^2\,dx + 2\left(\int_0^1 |f(x)|^2\,dx\right)^{1/2} \left(\int_0^1 |g(x)|^2\,dx\right)^{1/2} + \int_0^1 |g(x)|^2\,dx < \infty.$$

[7] Let $C^\infty[a, b]$ be the space of infinitely differentiable functions on $[a, b]$.

[8] Let $\Omega$ be a subset of $\mathbb{R}^n$, and $C^k(\Omega)$ the space of $k$ times continuously
differentiable functions. This means that if $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}^n$ has $|\alpha| = \alpha_1 + \dots + \alpha_n \le k$,
then the partial derivatives

$$\frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}$$

exist and are continuous.



Note on Example 1.2[5]. The dimension claim may be seen as follows. If $S = \{s_1, \dots, s_n\}$ is finite, then the map that sends a function
$f \in C(S)$ to the vector $(f(s_1), \dots, f(s_n)) \in \mathbb{R}^n$ is an isomorphism of linear spaces. If $S$ is infinite,
then the map that sends a polynomial $f \in \mathbb{R}[x]$ to the function $f \in C(S)$ is injective (since
two polynomials that agree on infinitely many values must be identical). This shows that $C(S)$
contains an isomorphic copy of an infinite-dimensional space, so must be infinite-dimensional.

From now on we will drop the special notation $\oplus$, $\cdot$ for vector addition and
scalar multiplication. We will also (normally) use plain letters $x$, $y$ and so on for
elements of linear spaces.

As in the linear algebra of finite-dimensional vector spaces, subsets of linear 
spaces that are themselves linear spaces are called linear subspaces. 

2. Linear subspaces 

Definition 1.3. Let $V$ be a linear space over the field $k$. A subset $W \subset V$
is a linear subspace of $V$ if for all $x, y \in W$ and $\alpha, \beta \in k$, the linear combination
$\alpha x + \beta y \in W$.

Example 1.4. [1] The set of vectors in $\mathbb{R}^n$ of the form $(x_1, x_2, x_3, 0, \dots, 0)$
forms a three-dimensional linear subspace.

[2] The set of polynomials of degree $\le r$ forms a linear subspace of the set of
polynomials of degree $\le n$ for any $r \le n$.

[3] (cf. Example 1.2[8]) The space $C^{k+1}(\Omega)$ is a linear subspace of $C^k(\Omega)$.

3. Linear independence 

Let $V$ be a linear space. Elements $x_1, x_2, \dots, x_n$ of $V$ are linearly dependent if
there are scalars $\alpha_1, \dots, \alpha_n$ (not all zero) such that

$$\alpha_1 x_1 + \dots + \alpha_n x_n = 0.$$

If there is no such set of scalars, then they are linearly independent.
The linear span of the vectors $x_1, x_2, \dots, x_n$ is the linear subspace

$$\operatorname{span}\{x_1, \dots, x_n\} = \left\{ x = \sum_{j=1}^{n} \alpha_j x_j \;\middle|\; \alpha_j \in k \right\}.$$

Definition 1.5. If the linear space V is equal to the span of a linearly inde- 
pendent set of n vectors, then V is said to have dimension n. If there is no such 
set of vectors, then V is infinite-dimensional. 

A linearly independent set of vectors that spans V is called a basis for V. 

Example 1.6. [1] (cf. Example 1.2[1]) The space $\mathbb{R}^n$ has dimension $n$; the standard
basis is given by the vectors $e_1 = (1, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, ..., $e_n =
(0, \dots, 0, 1)$.

[2] (cf. Example 1.2[2]) A basis is given by $\{1, t, t^2, \dots, t^n\}$, showing the space to
have dimension $n + 1$.

[3] Examples 1.2[4], [5], [6], [7], [8] are all infinite-dimensional.
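
As an illustration of linear independence in a finite-dimensional space, the following sketch tests whether a list of vectors is linearly independent by computing its rank with exact rational arithmetic. The function names `rank` and `linearly_independent`, and the encoding of a polynomial $c_0 + c_1 t + c_2 t^2$ as the triple $(c_0, c_1, c_2)$, are illustrative choices and not part of the notes.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of equal-length vectors, by Gaussian elimination over Q."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    rank_count = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a row at or below the pivot position with a non-zero entry.
        pivot = next((r for r in range(rank_count, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        # Eliminate this column from every row below the pivot.
        for r in range(rank_count + 1, len(rows)):
            factor = rows[r][col] / rows[rank_count][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank_count])]
        rank_count += 1
    return rank_count

def linearly_independent(vectors):
    """Vectors are independent exactly when the rank equals their number."""
    return rank(vectors) == len(vectors)

# Polynomials of degree <= 2 stored as coefficient triples (c0, c1, c2).
monomials = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]   # 1, t, t^2
dependent = [(1, 1, 0), (0, 1, 1), (1, 2, 1)]   # third = first + second

print(linearly_independent(monomials))
print(linearly_independent(dependent))
```

The monomials form a basis of the three-dimensional space of polynomials of degree at most 2, while the second family satisfies a non-trivial linear relation and so spans only a two-dimensional subspace.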

4. Norms 

A norm on a vector space is a way of measuring distance between vectors. 

Definition 1.7. A norm on a linear space $V$ over $k$ is a non-negative function
$\|\cdot\| : V \to \mathbb{R}$ with the properties that

(1) $\|x\| = 0$ if and only if $x = 0$ (positive definite);

(2) $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in V$ (triangle inequality);

(3) $\|\alpha x\| = |\alpha| \|x\|$ for all $x \in V$ and $\alpha \in k$.

In Definition 1.7(3) we are assuming that $k$ is $\mathbb{R}$ or $\mathbb{C}$ and $|\cdot|$ denotes the usual
absolute value. If $\|\cdot\|$ is a function with properties (2) and (3) only it is called a
semi-norm.

Definition 1.8. A normed linear space is a linear space $V$ with a norm $\|\cdot\|$
(sometimes we write $\|\cdot\|_V$).

Definition 1.9. A set $C$ in a linear space is convex if for any two points
$x, y \in C$, $tx + (1 - t)y \in C$ for all $t \in [0, 1]$.



Definition 1.10. A norm ||-|| is strictly convex if ||x|| = 1, ||y|| = 1, ||x+y|| = 2 
together imply that x = y. 

We won't be using convexity methods much, but for each of the examples try to 
work out whether or not the norm is strictly convex. Strict convexity is automatic 
for Hilbert spaces. 

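Example 1.11. [1] Let $V = \mathbb{R}^n$ with the usual Euclidean norm

$$\|x\| = \|x\|_2 = \left(\sum_{j=1}^{n} |x_j|^2\right)^{1/2}.$$

To check this is a norm the only difficulty is the triangle inequality: for this we use
the Cauchy-Schwarz inequality.

[2] There are many other norms on $\mathbb{R}^n$, called the $p$-norms. For $1 \le p < \infty$ define

$$\|x\|_p = \left(\sum_{j=1}^{n} |x_j|^p\right)^{1/p}.$$

Then $\|\cdot\|_p$ is a norm on $V$: to check the triangle inequality use Minkowski's
Inequality

$$\left(\sum_{j=1}^{n} |x_j + y_j|^p\right)^{1/p} \le \left(\sum_{j=1}^{n} |x_j|^p\right)^{1/p} + \left(\sum_{j=1}^{n} |y_j|^p\right)^{1/p}.$$

There is another norm corresponding to $p = \infty$,

$$\|x\|_\infty = \max_{1 \le j \le n} \{|x_j|\}.$$

It is conventional to write $\ell_p^n$ for these spaces. Notice that the linear spaces $\ell_p^n$ and
$\ell_q^n$ have exactly the same elements.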

[3] Let $X = \ell_\infty$ be the linear space of bounded infinite sequences (cf. Example
1.2[4]). Consider the function

$$\|x\|_p = \left(\sum_{j=1}^{\infty} |x_j|^p\right)^{1/p}.$$

If we restrict attention to the linear subspace on which $\|\cdot\|_p$ is finite, then $\|\cdot\|_p$ is a
norm (to check this use the infinite version of Minkowski's inequality). This gives
an infinite family of normed linear spaces,

$$\ell_p = \{x = (x_1, x_2, \dots) \mid \|x\|_p < \infty\}.$$

Notice that for $p < \infty$ there is a strict inclusion $\ell_p \subset \ell_\infty$. Indeed, for any $p < q$
there is a strict inclusion $\ell_p \subset \ell_q$, so $\ell_p$ is a linear subspace of $\ell_q$. That is, the sets
$\ell_p$ and $\ell_q$ for $p \ne q$ do not contain the same elements.

[4] Let $X = C[a, b]$, and put $\|f\| = \sup_{t \in [a,b]} |f(t)|$. This is called the uniform or
supremum norm. Why is it finite?

[5] Let $X = C[a, b]$, and choose $1 \le p < \infty$. Then (using the integral form of
Minkowski's inequality) we have the $p$-norm

$$\|f\|_p = \left(\int_a^b |f(t)|^p\,dt\right)^{1/p}.$$

[6] (cf. Example 1.2[6]). Let $V$ be the set of Riemann-integrable functions $f :
(0, 1) \to \mathbb{R}$ which are square-integrable. Let $\|f\|_2 = \left(\int_0^1 |f(x)|^2\,dx\right)^{1/2} < \infty$. Then $V$ is
a normed linear space.
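
The request to decide, for each example, whether the norm is strictly convex (Definition 1.10) can be explored numerically in $\mathbb{R}^2$. The sketch below is only an illustration: the helper names `p_norm` and `midpoint_test` are hypothetical, and the test vectors are chosen by hand. It evaluates $\|x + y\|_p$ for pairs of distinct unit vectors.

```python
def p_norm(x, p):
    """p-norm of a finite sequence x, for 1 <= p < infinity or p = float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def midpoint_test(x, y, p):
    """Return ||x + y||_p for two vectors that are assumed to have p-norm 1."""
    return p_norm([a + b for a, b in zip(x, y)], p)

x, y = (1.0, 0.0), (0.0, 1.0)
print(midpoint_test(x, y, 1))   # distinct unit vectors whose sum has 1-norm 2
print(midpoint_test(x, y, 2))   # strictly less than 2 in the Euclidean norm

u, v = (1.0, 0.0), (1.0, 1.0)
print(midpoint_test(u, v, float("inf")))  # unit vectors whose sum has max-norm 2
```

Since the pairs are distinct unit vectors with $\|x + y\| = 2$, the norms for $p = 1$ and $p = \infty$ fail the condition of Definition 1.10, whereas the Euclidean pair gives a value below 2, consistent with strict convexity of the 2-norm.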

5. Isomorphism of normed linear spaces 

Recall from linear algebra that linear spaces $V$ and $W$ are (algebraically)
isomorphic if there is a bijection $T : V \to W$ that is linear:

$$T(\alpha x + \beta y) = \alpha T(x) + \beta T(y)$$

for all $\alpha, \beta \in k$ and $x, y \in V$.

A pair $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ of normed linear spaces are (topologically)
isomorphic if there is a linear bijection $T : X \to Y$ with the property that there are
positive constants $a, b$ with

(1) $a\|x\|_X \le \|T(x)\|_Y \le b\|x\|_X$ for all $x \in X$.

We shall usually denote topological isomorphism by $X \cong Y$.

Lemma 1.12. If $X$ and $Y$ are $n$-dimensional normed linear spaces over $\mathbb{R}$ (or
$\mathbb{C}$) then $X$ and $Y$ are topologically isomorphic.

If the constants $a$ and $b$ in equation (1) may both be taken as 1, so $\|T(x)\|_Y =
\|x\|_X$, then $T$ is called an isometry and the normed spaces $X$ and $Y$ are called
isometric.

Example 1.13. The real linear spaces $(\mathbb{C}, |\cdot|)$ and $(\mathbb{R}^2, \|\cdot\|_2)$ are isometric.
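
A small numerical sketch of the two notions above. It checks inequality (1) for the identity map from $(\mathbb{R}^n, \|\cdot\|_2)$ to $(\mathbb{R}^n, \|\cdot\|_1)$ with the constants $a = 1$ and $b = \sqrt{n}$ (a standard pair of constants, not taken from the notes), and it checks the isometry of Example 1.13 on one complex number. The helper names are hypothetical, and random sampling only illustrates the inequalities; it does not prove them.

```python
import math
import random

def norm1(x):
    return sum(abs(t) for t in x)

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

def to_plane(z):
    """The map C -> R^2 sending z to (Re z, Im z)."""
    return (z.real, z.imag)

random.seed(0)
n = 5
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(n)]
    # Inequality (1) for the identity map, with a = 1 and b = sqrt(n).
    assert norm2(x) <= norm1(x) + 1e-9
    assert norm1(x) <= math.sqrt(n) * norm2(x) + 1e-9

z = complex(3, -4)
print(abs(z), norm2(to_plane(z)))  # the modulus and the Euclidean norm agree
```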

If $Y$ is a subspace of a normed linear space $(X, \|\cdot\|_X)$ then $\|\cdot\|_X$ restricted to
$Y$ makes $Y$ into a normed subspace.

Example 1.14. Let $Y$ denote the space of infinite real sequences with only
finitely many non-zero terms. Then $Y$ is a linear subspace of $\ell_p$ for any $1 \le p \le \infty$
so the $p$-norm makes $Y$ into a normed space.

6. Products of normed spaces 

If $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ are normed linear spaces, then the product

$$X \times Y = \{(x, y) \mid x \in X, y \in Y\}$$

is a linear space which may be made into a normed space in many different ways,
a few of which follow.

Example 1.15. [1] $\|(x, y)\| = \left(\|x\|_X^p + \|y\|_Y^p\right)^{1/p}$;
[2] $\|(x, y)\| = \max\{\|x\|_X, \|y\|_Y\}$.
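
The two product norms can be compared directly once the component norms $\|x\|_X$ and $\|y\|_Y$ are known as numbers. The sketch below assumes $1 \le p < \infty$ in the first norm; the function names are hypothetical. It also checks the elementary comparison $\max \le (\cdot)_p \le 2^{1/p} \max$ on a grid of values.

```python
def product_p_norm(norm_x, norm_y, p):
    """(||x||_X^p + ||y||_Y^p)^(1/p), given the two component norms as numbers."""
    return (norm_x ** p + norm_y ** p) ** (1.0 / p)

def product_max_norm(norm_x, norm_y):
    """max{||x||_X, ||y||_Y}."""
    return max(norm_x, norm_y)

for p in (1, 2, 3, 10):
    for nx in (0.0, 0.5, 1.0, 7.0):
        for ny in (0.0, 0.25, 1.0, 3.0):
            big = product_max_norm(nx, ny)
            mixed = product_p_norm(nx, ny, p)
            # The two norms differ by at most the factor 2^(1/p).
            assert big <= mixed + 1e-12
            assert mixed <= 2 ** (1.0 / p) * big + 1e-12

print(product_p_norm(3, 4, 2), product_max_norm(3, 4))
```

The comparison shows that these norms on $X \times Y$ satisfy an inequality of the form (1) with respect to one another, so they give the same convergent sequences.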
7. Continuous maps between normed spaces

We have seen continuous maps between $\mathbb{R}$ and $\mathbb{R}$ in first year analysis. To make
this definition we used the distance function $|x - y|$ on $\mathbb{R}$: a function $f : \mathbb{R} \to \mathbb{R}$ is
continuous if

(2) $\forall a \in \mathbb{R}$, $\forall \epsilon > 0$, $\exists \delta > 0$ such that $|x - a| < \delta \implies |f(x) - f(a)| < \epsilon$.

Looking at (2), we see that exactly the same definition can be made for maps
between normed linear spaces, which in view of Example 1.11 will give us the
possibility of talking about continuous maps between spaces of functions. Thus, on
suitably defined spaces, questions like "is the map $f \mapsto f'$ continuous?" or "is the
map $f \mapsto \int_0^x f$ continuous?" can be asked.

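Definition 1.16. A map $f : X \to Y$ between normed linear spaces $(X, \|\cdot\|_X)$
and $(Y, \|\cdot\|_Y)$ is continuous at $a \in X$ if

$\forall \epsilon > 0$ $\exists \delta = \delta(\epsilon, a) > 0$ such that $\|x - a\|_X < \delta \implies \|f(x) - f(a)\|_Y < \epsilon$.

If $f$ is continuous at every $a \in X$ then we simply say $f$ is continuous.
Finally, $f$ is uniformly continuous if

$\forall \epsilon > 0$ $\exists \delta = \delta(\epsilon) > 0$ such that $\|x - y\|_X < \delta \implies \|f(x) - f(y)\|_Y < \epsilon$ $\forall x, y \in X$.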
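
The difference between $\delta(\epsilon, a)$ and $\delta(\epsilon)$ can be made concrete for the map $x \mapsto x^2$ on $\mathbb{R}$. For $a \ge 0$, solving $2a\delta + \delta^2 = \epsilon$ shows that the largest admissible $\delta$ at the point $a$ is $\sqrt{a^2 + \epsilon} - a$. The sketch below (the function name `best_delta` is an illustrative choice) tabulates this value and shows it shrinking as $a$ grows, so no single $\delta$ serves every $a$.

```python
import math

def best_delta(a, eps):
    """Largest delta with |x - a| < delta  =>  |x^2 - a^2| < eps, for a >= 0.

    Algebraically equal to sqrt(a^2 + eps) - a; the quotient form below
    avoids cancellation when a is large.
    """
    return eps / (math.sqrt(a * a + eps) + a)

eps = 0.1
for a in (0.0, 1.0, 10.0, 100.0, 1000.0):
    print(a, best_delta(a, eps))
```

Because `best_delta(a, eps)` tends to 0 as `a` grows with `eps` fixed, the map is continuous at every point but not uniformly continuous.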

Example 1.17. [1] The map $x \mapsto x^2$ from $(\mathbb{R}, |\cdot|)$ to itself is continuous but
not uniformly continuous.

[2] Let $f(x) = Ax$ be the non-trivial linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$ (with Euclidean
norms) defined by the $m \times n$ matrix $A = (a_{ij})$. Using the Cauchy-Schwarz
inequality, we see that $f$ is uniformly continuous: fix $a \in \mathbb{R}^n$ and $b = Aa$. Then for
any $x \in \mathbb{R}^n$ we have

$$\|Ax - Aa\|^2 = \sum_{i=1}^{m} \left|\sum_{j=1}^{n} a_{ij}(x_j - a_j)\right|^2 \le \sum_{i=1}^{m} \left(\sum_{j=1}^{n} |a_{ij}|^2\right) \left(\sum_{j=1}^{n} |x_j - a_j|^2\right) = C^2 \|x - a\|^2$$

where $C^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2 > 0$. It follows that $f$ is uniformly continuous, and
we may take $\delta = \epsilon / C$.

[3] Let $X$ be the space of continuous functions $[-1, 1] \to \mathbb{R}$ with the sup norm (cf.
Example 1.11[4]). Define a map $F : X \to X$ by $F(u) = v$, where

$$v(t) = 1 + \int_0^t (\sin u(s) + \tan s)\,ds.$$

The map $F$ is uniformly continuous on $X$. Notice that $F$ is intimately connected
to a certain differential equation: a fixed point for $F$ (that is, an element $u \in X$
for which $F(u) = u$) is a continuous solution to the ordinary differential equation

$$\frac{du}{dt} = \sin(u) + \tan(t); \quad u(0) = 1$$

in the region $t \in [-1, 1]$. We shall see later that $F$ does indeed have a fixed point;
knowing that $F$ is uniformly continuous is a step towards this. To see that $F$ is
uniformly continuous, calculate

$$\begin{aligned}
\|F(u) - F(v)\| &= \sup_{t \in [-1,1]} |F(u)(t) - F(v)(t)| \\
&= \sup_{t \in [-1,1]} \left| 1 + \int_0^t (\sin u(s) + \tan s)\,ds - \left(1 + \int_0^t (\sin v(s) + \tan s)\,ds\right) \right| \\
&= \sup_{t \in [-1,1]} \left| \int_0^t (\sin u(s) - \sin v(s))\,ds \right| \\
&\le \sup_{t \in [-1,1]} \left| \int_0^t |\sin u(s) - \sin v(s)|\,ds \right| \\
&\le \|u - v\|.
\end{aligned}$$

Notice we have used the inequality $|\sin u - \sin v| \le |u - v|$, an easy consequence of
the Mean Value Theorem, together with $|t| \le 1$.

[4] Let $X$ be the space of complex-valued square-integrable Riemann-integrable
functions on $[0, 1]$ with the 2-norm (cf. Example 1.11[6]). Define a map $F : X \to X$
by $F(u) = v$, with

$$v(t) = \int_0^t u^2(s)\,ds.$$

Then $F$ is continuous (but not uniformly continuous):

$$\begin{aligned}
|Fu(t) - Fv(t)| &= \left| \int_0^t (u^2(s) - v^2(s))\,ds \right| \\
&\le \int_0^1 (|u(s)| + |v(s)|)\,|u(s) - v(s)|\,ds \\
&\le \left(\int_0^1 (|u(s)| + |v(s)|)^2\,ds\right)^{1/2} \left(\int_0^1 |u(s) - v(s)|^2\,ds\right)^{1/2}
\end{aligned}$$

so that

$$\|Fu - Fv\|_2 \le \sup_{t \in [0,1]} |Fu(t) - Fv(t)| \le \big\|\,|u| + |v|\,\big\|_2\, \|u - v\|_2.$$

[5] The same map as in [4] applied to square-integrable Riemann-integrable functions
on $[0, \infty)$ is not continuous. To see this, let $a, b \in \mathbb{R}$ be positive and define

$$u(t) = \begin{cases} a, & 0 \le t \le 2b^2 \\ ia, & 2b^2 < t \le 4b^2 \\ 0, & \text{otherwise.} \end{cases}$$

Then $\|u - 0\|_2 = 2ab$. On the other hand,

$$F(u)(t) = \begin{cases} a^2 t, & 0 \le t \le 2b^2 \\ 4b^2a^2 - a^2 t, & 2b^2 < t \le 4b^2 \\ 0, & \text{otherwise.} \end{cases}$$

Then $\|F(u) - F(0)\|_2^2 = \frac{16}{3} a^4 b^6$. Now, given any $\delta > 0$ we may choose constants
$a, b$ with $2ab < \delta$ but $\frac{16}{3} a^4 b^6 = 1$. That is, given any $\delta > 0$ there is a function $u$
with the property that $\|u - 0\| < \delta$ but $\|F(u) - F(0)\| = 1$, showing that $F$ is not
continuous.

The moral is that the topological properties of infinite-dimensional spaces are a little
counter-intuitive.
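
The counterexample in Example 1.17[5] can be checked numerically. The sketch below uses a midpoint Riemann sum (a choice made here for simplicity, with hypothetical helper names) to approximate $\|u\|_2$ and $\|F(u)\|_2$ on $[0, 4b^2]$, outside which both functions vanish. The parameters are chosen so that $2ab = \delta/2$ while $\frac{16}{3}a^4b^6 = 1$.

```python
import math

def u(t, a, b):
    """Step function of the example: a, then i*a, then 0."""
    if 0 <= t <= 2 * b * b:
        return complex(a, 0)
    if 2 * b * b < t <= 4 * b * b:
        return complex(0, a)
    return 0j

def F_u(t, a, b):
    """Closed form of F(u)(t), the integral of u(s)^2 from 0 to t."""
    if t <= 2 * b * b:
        return a * a * t
    if t <= 4 * b * b:
        return 4 * a * a * b * b - a * a * t
    return 0.0

def l2_norm(f, upper, steps=20000):
    """Midpoint-rule approximation of the 2-norm of f on [0, upper]."""
    h = upper / steps
    total = sum(abs(f((k + 0.5) * h)) ** 2 for k in range(steps))
    return math.sqrt(total * h)

def choose(delta):
    """Pick a, b > 0 with 2ab = delta/2 and (16/3) a^4 b^6 = 1."""
    c = delta / 4.0                       # c = a * b
    b = math.sqrt(3.0) / (4.0 * c * c)    # from a^2 b^3 = c^2 * b = sqrt(3)/4
    return c / b, b

for delta in (1.0, 0.5):
    a, b = choose(delta)
    top = 4 * b * b
    norm_u = l2_norm(lambda t: u(t, a, b), top)
    norm_Fu = l2_norm(lambda t: F_u(t, a, b), top)
    print(delta, norm_u, norm_Fu)
```

As $\delta$ shrinks, the input norm follows it down while the output norm stays near 1; the price is that the support $[0, 4b^2]$ grows without bound, which is exactly what a bounded interval forbids.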

8. Sequences and completeness in normed spaces 

Just as for continuity, we can use the norm on a normed linear space to define 
convergence for sequences and series in a normed space using the corresponding 
notion for R. 

Let $X = (X, \|\cdot\|_X)$ be a normed linear space. A sequence $(x_n)$ in $X$ is said to
converge to $a \in X$ if

$$\|x_n - a\| \to 0$$

as $n \to \infty$.

Similarly, a series $\sum_{n=1}^{\infty} x_n$ converges if the sequence of partial sums $(s_N)$
defined by $s_N = \sum_{n=1}^{N} x_n$ is a convergent sequence.

Example 1.18. [1] If $(x_j)$ is a sequence in $\mathbb{R}^n$, with $x_j = (x_j^{(1)}, \dots, x_j^{(n)})$, then
check that

$$\|x_j\|_p \to 0$$

(that is, $(x_j)$ converges to $0$ in the space $\ell_p^n$) if and only if $x_j^{(k)} \to 0$ in $\mathbb{R}$ for each
$k = 1, \dots, n$.

[2] For infinite-dimensional spaces, it is not enough to check convergence on each
component using a basis. Let $(x_j)$ be the sequence in $\ell_p$ defined by

$$x_j = (0, 0, \dots, 1, \dots)$$

(where the 1 appears in the $j$th position). Then if we write $x_j = (x_j^{(1)}, x_j^{(2)}, \dots)$ we
certainly have $x_j^{(k)} \to 0$ as $j \to \infty$ for each $k$. However, we also have $\|x_j\|_p = 1$ for
all $j$, so the sequence is certainly not converging to 0. Indeed, it is not converging
to anything.
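
The behaviour in Example 1.18[2] can be seen on truncated sequences. Since each $x_j$ has a single non-zero entry, storing its first `length` coordinates loses nothing as long as $j \le$ `length`; the truncation length and the helper names below are choices made for this sketch only.

```python
def unit_vector(j, length):
    """First `length` coordinates of the sequence with a 1 in position j (1-based)."""
    return [1.0 if k == j else 0.0 for k in range(1, length + 1)]

def p_norm(x, p):
    """p-norm of a finitely supported sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def distance(x, y, p):
    return p_norm([a - b for a, b in zip(x, y)], p)

length, p = 50, 3
vectors = [unit_vector(j, length) for j in range(1, length + 1)]

# Fifth coordinate of x_1, ..., x_8: non-zero only for j = 5, so it tends to 0.
print([v[4] for v in vectors[:8]])
# The norm of each x_j is 1, and distinct terms stay 2^(1/p) apart.
print(p_norm(vectors[20], p), distance(vectors[20], vectors[40], p))
```

Because any two distinct terms are at distance $2^{1/p}$, no subsequence is Cauchy, which is why the sequence cannot converge to anything.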

Lemma 1.19. A map $F : X \to Y$ between normed linear spaces is continuous
at $a \in X$ if and only if

$$\lim_{n \to \infty} F(x_n) = F(a)$$

for every sequence $(x_n)$ converging to $a$.

PROOF. Replace $|\cdot|$ with $\|\cdot\|$ in the proof of this statement for functions
$\mathbb{R} \to \mathbb{R}$. □

Definition 1.20. A sequence $(x_n)$ is a Cauchy sequence if

$\forall \epsilon > 0$ $\exists N$ such that $n, m > N \implies \|x_n - x_m\| < \epsilon$.

It is clear that a convergent sequence is a Cauchy sequence. We know that in
the normed linear space $(\mathbb{R}, |\cdot|)$ the converse also holds, and it is a simple matter
to check that in $\mathbb{R}^n$ the converse holds. In many reasonable infinite-dimensional
normed linear spaces, however, there are Cauchy sequences that do not converge.

Definition 1.21. A normed linear space is said to be complete if all Cauchy 
sequences are convergent. 
Example 1.22. [1] The sequence $3, \frac{31}{10}, \frac{314}{100}, \frac{3141}{1000}, \dots$ is a Cauchy sequence
of rationals converging to the real number $\pi$.

[2] Consider the space $C[0, 1]$ of continuous functions under the sup norm (cf.
Example 1.11[4]). This is complete.

[3] The space $C[0, 1]$ under the 2-norm (cf. Example 1.11[5]) is not complete. To
see this, consider the sequence of functions

$$u_n(t) = \begin{cases} 0, & 0 \le t \le \frac{1}{2} - \frac{1}{n} \\ \frac{nt}{2} - \frac{n}{4} + \frac{1}{2}, & \frac{1}{2} - \frac{1}{n} < t \le \frac{1}{2} + \frac{1}{n} \\ 1, & \frac{1}{2} + \frac{1}{n} < t \le 1. \end{cases}$$

Then $(u_n)$ is a Cauchy sequence. For $m > n$ the functions $u_m$ and $u_n$ agree outside
$[\frac{1}{2} - \frac{1}{n}, \frac{1}{2} + \frac{1}{n}]$ and $|u_m - u_n| \le 1$, so

$$\|u_m - u_n\|_2^2 = \int_0^1 |u_m(t) - u_n(t)|^2\,dt = \int_{1/2 - 1/n}^{1/2} |u_m(t) - u_n(t)|^2\,dt + \int_{1/2}^{1/2 + 1/n} |u_m(t) - u_n(t)|^2\,dt \le \frac{2}{n} \to 0$$

as $m > n \to \infty$.

We claim that the sequence $(u_n)$ is not convergent in $C[0, 1]$ under the 2-norm. To
see this, let $g$ be the function defined by $g(t) = 0$ for $0 \le t \le \frac{1}{2}$ and $g(t) = 1$ for
$\frac{1}{2} < t \le 1$, and assume that there is a continuous function $f$ with $\|u_n - f\|_2 \to 0$
as $n \to \infty$. It is clear that $\|u_n - g\|_2 \to 0$ as $n \to \infty$ also, so we must have

(3) $\|f - g\|_2 = 0$.

Now examine $f(\frac{1}{2})$. If $f(\frac{1}{2}) \ne g(\frac{1}{2}) = 0$ then $|f - g|$ must be positive on $(\frac{1}{2} - \delta, \frac{1}{2})$
for some $\delta > 0$, which contradicts (3). We must therefore have $f(\frac{1}{2}) = 0$; but in
this case $|f - g|$ must be positive on $(\frac{1}{2}, \frac{1}{2} + \delta)$ for some $\delta > 0$, again contradicting
(3). We conclude that there is no continuous function $f$ that is the 2-norm limit
of the sequence $(u_n)$. Thus the normed space $(C[0, 1], \|\cdot\|_2)$ is not complete.
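
The contrast between the 2-norm and the sup norm on the ramp functions $u_n$ of Example 1.22[3] can be observed numerically. The sketch below uses a midpoint Riemann sum for the 2-norm and a grid maximum for the sup norm; both are approximations, and the helper names are hypothetical. It compares $u_n$ with $u_{2n}$.

```python
import math

def u(n, t):
    """The piecewise linear ramp u_n on [0, 1] (n >= 2)."""
    if t <= 0.5 - 1.0 / n:
        return 0.0
    if t <= 0.5 + 1.0 / n:
        return n * t / 2.0 - n / 4.0 + 0.5
    return 1.0

def two_norm_diff(f, g, steps=20000):
    """Midpoint-rule approximation of ||f - g||_2 on [0, 1]."""
    h = 1.0 / steps
    total = sum((f((k + 0.5) * h) - g((k + 0.5) * h)) ** 2 for k in range(steps))
    return math.sqrt(total * h)

def sup_diff(f, g, steps=20000):
    """Maximum of |f - g| over a uniform grid on [0, 1]."""
    return max(abs(f(k / steps) - g(k / steps)) for k in range(steps + 1))

for n in (10, 100):
    f = lambda t, n=n: u(n, t)
    g = lambda t, n=n: u(2 * n, t)
    print(n, two_norm_diff(f, g), sup_diff(f, g))
```

A direct integration gives $\|u_n - u_{2n}\|_2 = 1/\sqrt{24n}$, which tends to 0, while $\sup |u_n - u_{2n}| = \frac{1}{4}$ for every $n$. So $(u_n)$ is Cauchy in the 2-norm but not in the sup norm, consistent with $C[0,1]$ being complete under the sup norm.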

9. Topological language 

There are certain properties of subsets of normed linear spaces (and other 
more general spaces) that we use very often. Topology is a subject that begins 
by attaching names to these properties and then develops a shorthand for talking 
about such things. 

Definition 1.23. Let $X$ be a normed linear space.

A set $C \subset X$ is closed if whenever $(c_n)$ is a sequence in $C$ that is a convergent
sequence in $X$, the limit $\lim_{n \to \infty} c_n$ also lies in $C$.

A set $U \subset X$ is open if for every $u \in U$ there exists $\epsilon > 0$ such that $\|x - u\| <
\epsilon \implies x \in U$.

A set $S \subset X$ is bounded if there is an $R < \infty$ with the property that $x \in
S \implies \|x\| \le R$.

A set $S \subset X$ is connected if there do not exist open sets $A$, $B$ in $X$ with
$S \subset A \cup B$, $S \cap A \ne \emptyset$, $S \cap B \ne \emptyset$ and $S \cap A \cap B = \emptyset$.

Associated to any set $S \subset X$ in a normed space there are sets $S^\circ \subset S \subset \bar{S}$
defined as follows: the interior of $S$ is the set

(4) $S^\circ = \{x \in X \mid \exists \epsilon > 0 \text{ such that } \|x - y\| < \epsilon \implies y \in S\}$,






and the closure of $S$,

(5) $\bar{S} = \{x \in X \mid \forall \epsilon > 0\ \exists s \in S \text{ such that } \|s - x\| < \epsilon\}$.

Exercise 1.24. [1] Prove that a map $f : X \to Y$ is continuous (cf. Definition
1.16) if and only if for every open set $U \subset Y$, the pre-image $f^{-1}(U) \subset X$ is also
open.

[2] Show by example that for a continuous map $f : \mathbb{R} \to \mathbb{R}$ there may be open sets
$U$ for which $f(U)$ is not open.

It is clear from first year analysis that closed bounded sets (closed intervals, 
for example) have special properties. For example, recall the theorem of Bolzano- 
Weierstrass. 

Theorem 1.25. Let $S$ be a closed and bounded subset of $\mathbb{R}$. Then a continuous
function $f : S \to \mathbb{R}$ attains its bounds: there exist $\xi, \eta \in S$ with the property that

$$f(\xi) = \sup_{s \in S} f(s), \quad f(\eta) = \inf_{s \in S} f(s).$$

Definition 1.26. A subset $S$ of a normed linear space is (sequentially) compact
if and only if every sequence $(s_n)$ in $S$ has a subsequence $(s_{n_j}) = (s_{n_1}, s_{n_2}, \dots)$ that
converges in $S$.

Recall the following theorem (the Heine-Borel theorem), which is really the
same one as Theorem 1.25.

Theorem 1.27. A subset of $\mathbb{R}^n$ is compact if and only if it is closed and
bounded.

By now you should be used to the idea that such a result need not extend to
infinite-dimensional normed linear spaces: Example 1.18[2] is a bounded sequence
with no convergent subsequences, so Theorem 1.27 does not extend to
infinite-dimensional normed spaces. However, the analogue of Theorem 1.25 does
hold in great generality. This is also a version of the Bolzano-Weierstrass theorem.

Theorem 1.28. If $A$ is a compact subset of a normed linear space $X$, and
$f : X \to Y$ is a continuous map between normed linear spaces, then $f(A)$ is a
compact subset of $Y$.

As an exercise, convince yourself that Theorem 1.28 implies Theorem 1.25.
Some standard sets are used so often that we give them special names.

Definition 1.29. Let $X$ be a normed space. Then the open ball of radius
$\epsilon > 0$ and centre $x_0$ is the set

$$B_\epsilon(x_0) = \{x \in X \mid \|x - x_0\| < \epsilon\}.$$

The closed ball of radius $\epsilon > 0$ and centre $x_0$ is the set

$$\bar{B}_\epsilon(x_0) = \{x \in X \mid \|x - x_0\| \le \epsilon\}.$$

Exercise 1.30. Open and closed balls in normed spaces are convex (cf.
Definition 1.9).

Definition 1.31. A subset $S$ of a normed space $X$ is dense if every open ball
in $X$ has non-empty intersection with $S$. A normed space is said to be separable
if there is a countable set $S = \{x_1, x_2, \dots\}$ that is dense in $X$.
10. Quotient spaces

As an application of Section 9, quotients of normed spaces may be formed.
Notice that we need both the algebraic structure (subspace of a linear space) and
a topological property (closed) to make it all work.

Recall from Definition 1.23 and Definition 1.3 that a closed linear subspace $Y$
of a normed linear space $X$ is a subset $Y \subset X$ that is itself a linear space, with the
property that any sequence $(y_n)$ of elements of $Y$ that converges in $X$ has its limit
in $Y$.

The linear space $X/Y$ (the quotient or factor space) is formed as follows. The
elements of $X/Y$ are cosets of $Y$: sets of the form $x + Y$ for $x \in X$. The set of
cosets is a linear space under the operations

$$(x_1 + Y) \oplus (x_2 + Y) = (x_1 + x_2) + Y, \quad \lambda \cdot (x + Y) = \lambda x + Y.$$

Notice that this makes sense precisely because $Y$ is itself a linear space, so for
example $Y + Y = Y$ and $\lambda Y = Y$ for $\lambda \ne 0$. Two cosets $x_1 + Y$ and $x_2 + Y$ are
equal if as sets $x_1 + Y = x_2 + Y$, which is true if and only if $x_1 - x_2 \in Y$.

Example 1.32. [1] Let $X = \mathbb{R}^3$, and let $Y$ be the subspace spanned by $(1, 1, 0)$.
Then $X/Y$ is a two-dimensional real vector space. There are many pairs of elements
that generate $X/Y$, for example

$$(1, 0, 1) + Y \quad \text{and} \quad (0, 0, 1) + Y.$$

[2] The linear space $Y$ of finitely supported sequences in $\ell_1$ is a linear subspace. The
quotient space $\ell_1/Y$ is very hard to visualize: its elements are equivalence classes
under the relation $(x_n) \sim (y_n)$ if the sequences $(x_n)$ and $(y_n)$ differ in finitely many
positions.

[3] The linear space $Y$ of $\ell_1$ sequences of the form $(0, \dots, 0, x_{n+1}, \dots)$ (first $n$ are
zero) is a linear subspace of $\ell_1$. Here the quotient space $\ell_1/Y$ is quite reasonable:
in fact it is isomorphic to $\mathbb{R}^n$.

[4] We know that for $p, q \in [1, \infty]$, $p < q \implies \ell_p \subset \ell_q$. This means that for
any $p < q$ there is a linear quotient space $\ell_q/\ell_p$. These quotient spaces are very
pathological.

[5] The linear space $Y = C[0, 1]$ is a linear subspace of the space $X$ of
square-Riemann-integrable functions on $[0, 1]$. The quotient $X/Y$ is again a linear space
that is impossible to work with.

[6] Let $X = C[0, 1]$, and let $Y = \{f \in X \mid f(0) = 0\}$. Then $X/Y$ is isomorphic to $\mathbb{R}$.

It is clear from these examples that not all linear subspaces are equally good:
Examples 1.32[1], [3], and [6] are quite reasonable, whereas [2], [4] and [5] are
examples of linear spaces unlike any we have seen. The reason is the following: the
space $X/Y$ is guaranteed to be a normed space with a norm related to the original
norm on $X$ only when the subspace $Y$ is itself closed. Notice that Examples 1.32[1],
[3], and [6] are precisely the ones in which the subspace is closed.

Theorem 1.33. If $X$ is a normed space, and $Y$ is a closed linear subspace,
then $X/Y$ is a normed space under the norm

(6) $\|x + Y\| = \inf_{z \in x + Y} \|z\|$.

Before proving this theorem, try to convince yourself that the norm (6) is the
obvious candidate: if $X = \mathbb{R}^2$ and $Y = (1, 0)\mathbb{R}$, then the space $X/Y$ consists of
lines in $X$ of the form $(s, t) + Y$. Notice that each such line may be written uniquely
in the form $(0, t) + Y$, and this choice minimizes the norm of the element of $X$ that
represents the line.

PROOF. Let $z + Y$ be any coset of $Y$, and let $(x_n) \subset z + Y$ be a convergent
sequence with $x_n \to x$. Then for any fixed $n$, $x_n - x_m \to x_n - x$ as $m \to \infty$ is a sequence
in $Y$ converging in $X$. Since $Y$ is closed, we must have $x_n - x \in Y$, so $x + Y =
x_n + Y = z + Y$. That is, the limit of the sequence defines the same coset as does
the sequence: the set $z + Y$ is a closed set.

Assume now that $\|x + Y\| = 0$. Then there is a sequence $(x_n) \subset x + Y$ with
$\|x_n\| \to 0$. Since $x + Y$ is closed and $x_n \to 0$, we must have $0 \in x + Y$, so $x + Y = Y$,
the zero element in $X/Y$.

Homogeneity is clear:

$$\|\lambda(x + Y)\| = \inf_{z \in x + Y} \|\lambda z\| = |\lambda| \inf_{z \in x + Y} \|z\| = |\lambda| \|x + Y\|.$$

Finally, the triangle inequality:

$$\begin{aligned}
\|(x_1 + Y) + (x_2 + Y)\| &= \inf_{z_1 \in x_1 + Y;\ z_2 \in x_2 + Y} \|z_1 + z_2\| \\
&\le \inf_{z_1 \in x_1 + Y} \|z_1\| + \inf_{z_2 \in x_2 + Y} \|z_2\| \\
&= \|x_1 + Y\| + \|x_2 + Y\|.
\end{aligned}$$

□
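
The quotient norm (6) can be computed explicitly in the illustration $X = \mathbb{R}^2$, $Y = (1, 0)\mathbb{R}$ with the Euclidean norm. In that special case the infimum is the distance from $x$ to the line $Y$, attained at the orthogonal projection. The sketch below compares that formula with a brute-force search over a grid of representatives; the function names, search radius and grid size are assumptions made for illustration.

```python
import math

def quotient_norm_projection(x, y_dir):
    """Euclidean distance from x to the line Y = span(y_dir) in R^2."""
    dot = x[0] * y_dir[0] + x[1] * y_dir[1]
    scale = dot / (y_dir[0] ** 2 + y_dir[1] ** 2)
    residual = (x[0] - scale * y_dir[0], x[1] - scale * y_dir[1])
    return math.hypot(*residual)

def quotient_norm_search(x, y_dir, radius=10.0, steps=20001):
    """Smallest ||x + r*y_dir|| over a uniform grid of r in [-radius, radius]."""
    best = float("inf")
    for k in range(steps):
        r = -radius + 2 * radius * k / (steps - 1)
        best = min(best, math.hypot(x[0] + r * y_dir[0], x[1] + r * y_dir[1]))
    return best

Y = (1.0, 0.0)
x1, x2 = (3.0, 2.0), (-1.0, -5.0)
print(quotient_norm_projection(x1, Y), quotient_norm_search(x1, Y))

# Triangle inequality for the cosets x1 + Y and x2 + Y.
total = (x1[0] + x2[0], x1[1] + x2[1])
lhs = quotient_norm_projection(total, Y)
rhs = quotient_norm_projection(x1, Y) + quotient_norm_projection(x2, Y)
print(lhs <= rhs)
```

For the coset $(s, t) + Y$ both computations return $|t|$, the norm of the representative $(0, t)$ singled out before the proof.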



Example 1.34. Even if the subspace is closed, the quotient space may be
a little odd. For example, let $c$ denote the space of all sequences $(x_n)$ with the
property that $\lim_n x_n$ exists. This is a closed subspace of $\ell_\infty$. What is the quotient
$\ell_\infty / c$?



CHAPTER 2 



Banach spaces 

It turns out to be very important and natural to work in complete spaces - 
trying to do functional analysis in non-complete spaces is a little like trying to do 
elementary analysis over the rationals. 

Definition 2.1. A complete normed linear space is called a Banach space.

Example 2.2. [1] We are already familiar with a large class of Banach spaces:
any finite-dimensional normed linear space is a Banach space. In our notation, this
means that $\ell_p^n$ is a Banach space for all $1 \le p \le \infty$ and all $n$.

[2] The space of continuous functions with the sup norm is a Banach space (cf.
Example 1.11[4] and Example 1.22[2]).

[3] The sequence space $\ell_p$ is a Banach space. To see this, assume that $(x_n)$ is a
Cauchy sequence in $\ell_p$, and write

$$x_n = (x_n^{(1)}, x_n^{(2)}, \dots).$$

Recall that $\|\cdot\|_p \ge \|\cdot\|_\infty$ for all $p$ (cf. Example 1.11[3]). So, given $\epsilon > 0$ we may
find $N$ with the property that

$$m, n > N \implies \|x_n - x_m\|_p < \epsilon$$

which in turn implies that $\|x_n - x_m\|_\infty < \epsilon$, so for each $k$, $|x_n^{(k)} - x_m^{(k)}| < \epsilon$. That
is, if $(x_n)$ is a Cauchy sequence in $\ell_p$, then for each $k$, $(x_n^{(k)})$ is a Cauchy sequence
in $\mathbb{R}$. Since $\mathbb{R}$ is complete, we deduce that for each $k$ we have $x_n^{(k)} \to y^{(k)}$. Notice
that this does not imply by itself that $x_n \to y$ (cf. Example 1.18[2]). However, if
we know (as we do) that $(x_n)$ is Cauchy, then it does: we prove this for $p < \infty$ but
the $p = \infty$ case is similar. Fix $\epsilon > 0$, and use the Cauchy criterion to find $N$ such
that $n, m > N$ implies that

$$\sum_{k=1}^{\infty} |x_n^{(k)} - x_m^{(k)}|^p < \epsilon.$$

Now fix $n$ and let $m \to \infty$ to see that

$$\sum_{k=1}^{\infty} |x_n^{(k)} - y^{(k)}|^p \le \epsilon$$

(notice that $<$ has become $\le$). This last inequality means that

$$\|x_n - y\|_p \le \epsilon^{1/p},$$

(Footnote to Definition 2.1: after the Polish mathematician Stefan Banach (1892-1945) who gave the first abstract
treatment of complete normed spaces in his 1920 thesis (Fundamenta Math., 3, 133-181, 1922).
His later book (Théorie des opérations linéaires, Warsaw, 1932) laid the foundations of functional
analysis.)
showing that in $\ell_p$, $x_n \to y = (y^{(1)}, y^{(2)}, \dots)$.

Lemma 2.3. Let $(x_n)$ be a sequence in a Banach space. If the series $\sum_{n=1}^{\infty} x_n$
is absolutely convergent, then it is convergent.

Recall that absolutely convergent means that the numerical series $\sum_{n=1}^{\infty} \|x_n\|$
is convergent. The lemma is clearly not true for general normed spaces: take, for
example, a sequence of functions $(f_n)$ in $C[0, 1]$ for which $\sum_{n=1}^{\infty} \|f_n\|_2$ converges but with the property
that $\sum_{n=1}^{\infty} f_n$ is not continuous.

PROOF. Consider the sequence of partial sums $s_m = \sum_{n=1}^{m} x_n$:

$$\|s_m - s_k\| \le \sum_{n=k+1}^{m} \|x_n\| \to 0$$

as $m > k \to \infty$. It follows that the sequence $(s_m)$ is Cauchy; since $X$ is complete
this sequence converges, so the series $\sum_{n=1}^{\infty} x_n$ converges. □
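
The estimate in the proof can be watched in a concrete Banach space. The sketch below works in $\mathbb{R}^2$ with the Euclidean norm and uses the terms $x_n = (\cos n, \sin n)/n^2$, a hypothetical example chosen so that $\|x_n\| = 1/n^2$; it checks that $\|s_m - s_k\|$ never exceeds the tail sum $\sum_{n=k+1}^{m} \|x_n\|$.

```python
import math

def term(n):
    """x_n in R^2 with Euclidean norm 1/n^2 (an illustrative sequence)."""
    return (math.cos(n) / n ** 2, math.sin(n) / n ** 2)

def partial_sum(m):
    """s_m = x_1 + ... + x_m."""
    sx = sy = 0.0
    for n in range(1, m + 1):
        x, y = term(n)
        sx, sy = sx + x, sy + y
    return (sx, sy)

def tail_bound(k, m):
    """Sum of ||x_n|| for n = k+1, ..., m."""
    return sum(1.0 / n ** 2 for n in range(k + 1, m + 1))

for k, m in ((10, 20), (100, 200), (1000, 2000)):
    sk, sm = partial_sum(k), partial_sum(m)
    gap = math.hypot(sm[0] - sk[0], sm[1] - sk[1])
    print(k, m, gap, tail_bound(k, m))
```

The tail sums shrink as $k$ grows, which forces the partial sums to be Cauchy; completeness of the space is what then supplies a limit.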

1. Completions 

Completeness is so important that in many applications we deal with non-complete
normed spaces by completing them. This is analogous to the process of
passing from $\mathbb{Q}$ to $\mathbb{R}$ by working with Cauchy sequences of rationals. In this section
we simply outline what is done. In later sections we will see more details about
what the completions look like.

Let $X$ be a normed linear space. Let $C(X)$ denote the set of all Cauchy
sequences in $X$. An element of $C(X)$ is then a Cauchy sequence $(x_n)$. The linear
space structure of $X$ extends to $C(X)$ by defining $\alpha \cdot (x_n) + (y_n) = (\alpha x_n + y_n)$. The
norm $\|\cdot\|$ on $X$ extends to a semi-norm on $C(X)$, defined by

\[ \|(x_n)\| = \lim_{n \to \infty} \|x_n\|. \]

Finally, define an equivalence relation $\sim$ on $C(X)$ by $(x_n) \sim (y_n)$ if and only if
$x_n - y_n \to 0$. Then the linear space operations and the semi-norm are well-defined
on the space of equivalence classes $C(X)/\sim$, giving a complete normed linear space
$\bar{X}$ called the completion of $X$.

Exercise 2.4. [1] Apply the process outlined above to the rationals $\mathbb{Q}$. Try to
see why the obvious extension of the norm to the space of Cauchy sequences only
gives a semi-norm.

[2] Construct a Cauchy sequence $(f_n)$ in $(C[0,1], \|\cdot\|_2)$ with the property that
$f_n \neq 0$ for any $n$ but $\|f_n\|_2 \to 0$. This means that the Cauchy sequence $(f_n)$ and
the Cauchy sequence $(0)$ are not separated by the semi-norm $\|\cdot\|_2$, showing it is
not a norm.

[3] Show that if $X$ is already a Banach space, then there is a bijective isomorphism
between $X$ and $\bar{X}$.

It should be clear from the above that it is going to be difficult to work with
elements of the completions in this formal way, where an element of $\bar{X}$ is an
equivalence class of Cauchy sequences. However all we will ever need is the simple
statement: for any normed linear space $X$, there is a Banach space $\bar{X}$ such that $X$
is isomorphic to a dense subspace $i(X)$ of $\bar{X}$; the map $i$ from $X$ into $\bar{X}$ preserves
all the linear space operations.






Example 2.5. [1] We have seen that $C[0,1]$ under the 2-norm is not complete
(cf. Example 1.22[2]). Similar examples will show that $C[0,1]$ is not complete
under any of the $p$-norms. Let $X$ denote the non-complete space $(C[0,1], \|\cdot\|_p)$.
A reasonable guess for $\bar{X}$ might be the space of Riemann-integrable functions with
finite $p$-norm, but this is still not complete. It is easy to construct a Cauchy
sequence of Riemann-integrable functions that does not converge to a
Riemann-integrable function in the $p$-norm. However, if you use Lebesgue integration, you
do get a complete space, called $L^p[0,1]$. For now, think of this space as consisting of
all Riemann-integrable functions with finite $p$-norm together with extra functions
obtained as limits of sequences of Riemann-integrable functions. Then $L^p$ provides
a further example of a Banach space.

[2] A function $f : X \to Y$ is said to have compact support if it is zero outside
some compact subset of $X$; the support of $f$ is the smallest closed set containing
$\{x \in X \mid f(x) \neq 0\}$. This example is of importance in distribution theory and
the study of partial differential equations. Let $C_0^{\infty}(\Omega)$ be the space of infinitely
differentiable functions of compact support on $\Omega$, an open subset of $\mathbb{R}^n$. Recall the
definition of higher-order derivatives $D^a$ from Example 1.2[8]. For each $k \in \mathbb{N}$ and
$1 \le p < \infty$ define a norm




This gives an infinite family of (different) normed space structures on the linear
space $C_0^{\infty}(\Omega)$. None of these spaces is complete because there are sequences of
$C^{\infty}$ functions whose $(n,p)$-limit is not even continuous. The completions of these
spaces are the Sobolev spaces.

2. Contraction mapping theorem 

In this section we prove the simplest of the many fixed-point theorems. Such 
theorems are useful for solving equations, and with the formalism of function spaces 
one uniform treatment may be given for numerical equations like $x = \cos(x)$ and
differential equations like $\frac{dy}{dx} = x + \tan(xy)$, $y(0) = y_0$.

Exercise 2.6. If you have an electronic calculator, put it in "radians" mode. 
Starting with any initial value, press the cos button repeatedly. What happens? 
Can you explain why this happens? (Draw a graph.) How does this relate to the
equation $x = \cos(x)$?
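
The calculator experiment in Exercise 2.6 can also be tried in a few lines of code. The following is only a sketch: the function name iterate_cos, the starting values and the number of steps are arbitrary choices made for illustration, and it does not replace the graphical explanation asked for in the exercise.

```python
import math


def iterate_cos(x0, steps):
    """Press the 'cos' button `steps` times, starting from x0 (radians)."""
    x = x0
    for _ in range(steps):
        x = math.cos(x)
    return x


# Three quite different starting values end up at (numerically) the same number,
# which is therefore a solution of x = cos(x).
for start in (-3.0, 0.0, 10.0):
    limit = iterate_cos(start, 100)
    print(start, limit, math.cos(limit) - limit)
```

The last column printed is the residual cos(x) - x at the final iterate, which should be close to zero.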

Definition 2.7. A map $F : X \to Y$ between normed linear spaces is called a
contraction if there is a constant $K < 1$ for which

\[ \|F(x) - F(y)\|_Y \le K \cdot \|x - y\|_X \tag{7} \]

for all $x, y \in X$.

Exercise 2.8. [1] Any contraction is uniformly continuous.
[2] If $f : [a,b] \to [a,b]$ has the property that $|f(x) - f(y)| < |x - y|$ then $f$ is a
contraction.

[3] Find an example of a function $f : \mathbb{R} \to \mathbb{R}$ that has the property $|f(x) - f(y)| <
|x - y|$ for all $x, y \in \mathbb{R}$, but $f$ is not a contraction.
Theorem 2.9. If $F : X \to X$ is a contraction and $X$ is a Banach space, then
there is a unique point $x^* \in X$ which is fixed by $F$ (that is, $F(x^*) = x^*$). Moreover,
if $x_0$ is any point in $X$, then the sequence defined by $x_1 = F(x_0)$, $x_2 = F(x_1)$, ...
converges to $x^*$.

Corollary 2.10. If $S$ is a closed subset of the Banach space $X$, and $F : S \to S$
is a contraction, then $F$ has a unique fixed point in $S$.

Proof. Simply notice that $S$ is itself complete (since it is a closed subset of
a complete space), and the proof of Theorem 2.9 does not use the linear space
structure of $X$. □

Corollary 2.11. If $S$ is a closed subset of a Banach space, and $F : S \to S$
has the property that for some $n$ there is a $K < 1$ such that

\[ \|F^n(x) - F^n(y)\| \le K \cdot \|x - y\| \]

for all $x, y \in S$, then $F$ has a unique fixed point.

Proof. Choose any point $x_0 \in S$. Then by Corollary 2.10 we have

\[ x = \lim_{k \to \infty} F^{kn} x_0, \]

where $x$ is the unique fixed point of $F^n$. By the continuity of $F$,

\[ Fx = \lim_{k \to \infty} F F^{kn} x_0. \]

On the other hand, $F^n$ is a contraction, so

\[ \|F^{kn} F x_0 - F^{kn} x_0\| \le K \|F^{(k-1)n} F x_0 - F^{(k-1)n} x_0\| \le \dots \le K^k \|F(x_0) - x_0\|, \]

so

\[ \|F(x) - x\| = \lim_{k \to \infty} \|F F^{kn} x_0 - F^{kn} x_0\| = 0. \]

It follows that $F(x) = x$ so $x$ is a fixed point for $F$. This fixed point is automatically
unique: if $F$ has more than one fixed point, then so does $F^n$, which is impossible
by Corollary 2.10. □

Exercise 2.12. [1] Give an example of a map $f : \mathbb{R} \to \mathbb{R}$ which has the property
that $|f(x) - f(y)| < |x - y|$ for all $x, y \in \mathbb{R}$ but $f$ has no fixed point.

[2] Let $f$ be a function from $[0,1]$ to $[0,1]$. Check that the contraction condition
(7) holds if $f$ has a continuous derivative $f'$ on $[0,1]$ with the property that

\[ |f'(x)| \le K < 1 \]

for all $x \in [0,1]$. As an exercise, draw graphs to illustrate convergence of the iterates
of $f$ to a fixed point for examples with $0 < f'(x) < 1$ and $-1 < f'(x) < 0$.

Example 2.13. A basic linear problem is the following: let $F : \mathbb{R}^n \to \mathbb{R}^n$ be
the affine map defined by

\[ F(x) = Ax + b \]



Iteration of continuous functions on the interval may be used to illustrate many of the
features of dynamical systems, including frequency locking, sensitive dependence on initial conditions,
period doubling, the Feigenbaum phenomena and so on. An excellent starting point is the article
and demonstration "One-dimensional iteration" at the web site http://www.geom.umn.edu/java/.
where $A = (a_{ij})$ is an $n \times n$ matrix. Equivalently, $F(x) = y$, where

\[ y_i = \sum_{j=1}^{n} a_{ij} x_j + b_i \]

for $i = 1, \dots, n$. If $F$ is a contraction, then we can apply Theorem 2.9 to solve the
equation $F(x) = x$. The conditions under which $F$ is a contraction depend on the
choice of norm for $\mathbb{R}^n$. Three examples follow.
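[1] Using the max norm, $\|x\|_{\infty} = \max_i \{|x_i|\}$. In this case,

\begin{align*}
\|F(x) - F(\tilde{x})\|_{\infty} &= \max_i \Big| \sum_j a_{ij}(x_j - \tilde{x}_j) \Big| \\
&\le \max_i \sum_j |a_{ij}|\,|x_j - \tilde{x}_j| \\
&\le \max_i \sum_j |a_{ij}| \max_j |x_j - \tilde{x}_j| \\
&= \Big( \max_i \sum_j |a_{ij}| \Big) \|x - \tilde{x}\|_{\infty}.
\end{align*}

Thus the contraction condition is

\[ \sum_j |a_{ij}| \le K < 1 \quad \text{for } i = 1, \dots, n. \tag{8} \]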

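[2] Using the 1-norm, $\|x\|_1 = \sum_{i=1}^{n} |x_i|$. In this case,

\begin{align*}
\|F(x) - F(\tilde{x})\|_1 &= \sum_i \Big| \sum_j a_{ij}(x_j - \tilde{x}_j) \Big| \\
&\le \sum_i \sum_j |a_{ij}|\,|x_j - \tilde{x}_j| \\
&\le \Big( \max_j \sum_i |a_{ij}| \Big) \|x - \tilde{x}\|_1.
\end{align*}

The contraction condition is now

\[ \sum_i |a_{ij}| \le K < 1 \quad \text{for } j = 1, \dots, n. \tag{9} \]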

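[3] Using the 2-norm, $\|x\|_2 = \big( \sum_{i=1}^{n} |x_i|^2 \big)^{1/2}$. In this case

\begin{align*}
\|F(x) - F(\tilde{x})\|_2^2 &= \sum_i \Big( \sum_j a_{ij}(x_j - \tilde{x}_j) \Big)^2 \\
&\le \Big( \sum_i \sum_j a_{ij}^2 \Big) \|x - \tilde{x}\|_2^2
\end{align*}

(by the Cauchy-Schwarz inequality).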

Of course the equation is in one sense trivial. However, it is sometimes of importance
computationally to avoid inverting matrices, and more importantly to have an iterative scheme
that converges to a solution in some predictable fashion.
The contraction condition is now

\[ \sum_i \sum_j a_{ij}^2 \le K < 1. \tag{10} \]

It follows that if any one of the conditions (8), (9), or (10) holds, then there
exists a unique solution in $\mathbb{R}^n$ to the affine equation $Ax + b = x$. Moreover,
the solution may be approximated using the iterative scheme $x_1 = F(x_0)$, $x_2 =
F(x_1)$, ....

Notice that each of the conditions (8), (9), (10) is sufficient for the method to
work, but none of them is necessary. In fact, for each of the three conditions there
are examples in which that condition holds and the other two fail.
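
The three conditions are easy to test numerically. The sketch below is an illustration only: the helper names and the particular matrix are our own choices, not part of the example. The matrix is chosen so that the row-sum condition (8) holds while (9) and (10) fail, and the iteration still converges to the solution of $Ax + b = x$.

```python
def row_sum_bound(A):
    """Left-hand side of condition (8): the largest row sum of |a_ij|."""
    return max(sum(abs(a) for a in row) for row in A)


def column_sum_bound(A):
    """Left-hand side of condition (9): the largest column sum of |a_ij|."""
    n = len(A)
    return max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))


def square_sum_bound(A):
    """Left-hand side of condition (10): the sum of all a_ij squared."""
    return sum(a * a for row in A for a in row)


def iterate_affine(A, b, x0, steps):
    """Apply F(x) = Ax + b to x0 the given number of times."""
    x = list(x0)
    for _ in range(steps):
        x = [sum(a * xj for a, xj in zip(row, x)) + bi for row, bi in zip(A, b)]
    return x


A = [[0.9, 0.0], [0.9, 0.0]]   # illustrative matrix: only condition (8) holds
b = [1.0, 1.0]
print(row_sum_bound(A), column_sum_bound(A), square_sum_bound(A))
print(iterate_affine(A, b, [0.0, 0.0], 500))   # approaches the fixed point (10, 10)
```

For this choice the fixed point can be found by hand: the first coordinate satisfies $x_1 = 0.9 x_1 + 1$, so $x_1 = 10$, and then $x_2 = 0.9 x_1 + 1 = 10$.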

It remains only to prove the contraction mapping theorem. 

Proof of Theorem 2.9. Given any point $x_0 \in X$, define a sequence $(x_n)$ by

\[ x_1 = F(x_0),\ x_2 = F(x_1),\ \dots \]

Then, for any $n \le m$ we have by the contraction condition (7)

\begin{align*}
\|x_n - x_m\| &= \|F^n x_0 - F^m x_0\| \\
&\le K^n \|x_0 - F^{m-n} x_0\| \\
&\le K^n \big( \|x_0 - x_1\| + \|x_1 - x_2\| + \dots + \|x_{m-n-1} - x_{m-n}\| \big) \\
&\le K^n \|x_0 - x_1\| \big( 1 + K + K^2 + \dots + K^{m-n-1} \big) \\
&\le \frac{K^n}{1 - K} \|x_0 - x_1\|.
\end{align*}

Now for fixed $x_0$, the last expression converges to zero as $n$ goes to infinity, so (cf.
Definition 1.20) the sequence $(x_n)$ is a Cauchy sequence.

Since the linear space $X$ is complete (cf. Definition 1.21), the sequence $(x_n)$
therefore converges, say

\[ x^* = \lim_{n \to \infty} x_n. \]

Since $F$ is continuous,

\[ F(x^*) = F\Big( \lim_{n \to \infty} x_n \Big) = \lim_{n \to \infty} F(x_n) = \lim_{n \to \infty} x_{n+1} = x^*, \]

so $F$ has a fixed point $x^*$. To prove that $x^*$ is the only fixed point for $F$, notice
that if $F(y) = y$ say, then

\[ \|x^* - y\| = \|F(x^*) - F(y)\| \le K \|x^* - y\|, \]

which requires that $x^* = y$ since $K < 1$. □
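
Letting $m \to \infty$ in the estimate used in the proof gives the a priori bound $\|x_n - x^*\| \le \frac{K^n}{1-K}\|x_0 - x_1\|$, which says in advance how many iterations are needed for a given accuracy. The sketch below checks this bound numerically for one map of our own choosing, $F(x) = \frac12\cos(x)$ on $\mathbb{R}$, for which $K = \frac12$ works because $|F'(x)| \le \frac12$; the function names are illustrative.

```python
import math


def F(x):
    """An illustrative contraction on the real line with constant K = 0.5."""
    return 0.5 * math.cos(x)


def orbit(func, x0, steps):
    """Return [x0, x1, ..., x_steps] where x_{n+1} = func(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(func(xs[-1]))
    return xs


def a_priori_bound(K, x0, x1, n):
    """The bound K^n / (1 - K) * |x0 - x1| on the distance from x_n to the fixed point."""
    return K ** n / (1.0 - K) * abs(x0 - x1)


K = 0.5
xs = orbit(F, 1.0, 60)
x_star = xs[-1]          # numerically indistinguishable from the fixed point
for n in (1, 5, 10, 20):
    print(n, abs(xs[n] - x_star), a_priori_bound(K, xs[0], xs[1], n))
```

In each printed row the actual error (second column) should not exceed the predicted bound (third column).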

3. Applications to differential equations 

As mentioned before, the most important applications of the contraction
mapping method are to function spaces. We have seen already in Example 1.17[3]
that fixed points for certain integral operators on function spaces are solutions of
ordinary differential equations. The first result in this direction is due to Picard.



(Charles) Émile Picard (1856-1941), who was Professor of higher analysis at the Sorbonne
and became permanent secretary of the Paris Academy of Sciences. Some of his deepest results
lie in complex analysis: 1) a non-constant entire function can omit at most one finite value, 2)
a non-polynomial entire function takes on every value (except the possible exceptional one) an
infinite number of times.
Theorem 2.14. Let $f : G \to \mathbb{R}$ be a continuous function defined on a set $G$
containing a neighbourhood $\{(x,y) \mid \|(x,y) - (x_0,y_0)\|_{\infty} < \epsilon\}$ of $(x_0,y_0)$ for some
$\epsilon > 0$. Suppose that $f$ satisfies a Lipschitz condition of the form

\[ |f(x,y) - f(x,\tilde{y})| \le M |y - \tilde{y}| \tag{11} \]

in the variable $y$ on $G$. Then there is an interval $(x_0 - \delta, x_0 + \delta)$ on which the
ordinary differential equation

\[ \frac{dy}{dx} = f(x,y) \tag{12} \]

has a unique solution $y = \phi(x)$ satisfying the initial condition

\[ \phi(x_0) = y_0. \tag{13} \]

Proof. The differential equation (12) with initial condition (13) is equivalent
to the integral equation

\[ \phi(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t))\,dt. \tag{14} \]

Since $f$ is continuous there is a bound

\[ |f(x,y)| \le R \tag{15} \]

for all $(x,y)$ with $\|(x,y) - (x_0,y_0)\|_{\infty} \le \epsilon'$ for some $\epsilon' > 0$. Choose $\delta > 0$ such that

(1) $|x - x_0| \le \delta$, $|y - y_0| \le R\delta$ together imply that $\|(x,y) - (x_0,y_0)\|_{\infty} \le \epsilon'$;

(2) $M\delta < 1$ where $M$ is the Lipschitz constant in (11).

Let $S$ be the set of continuous functions $\phi$ defined on the interval $|x - x_0| \le \delta$
with the property that $|\phi(x) - y_0| \le R\delta$, equipped with the sup metric. The set
$S$ is complete, since it is a closed subset of a complete space. Define a mapping
$F : S \to S$ by the equation

\[ (F(\phi))(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t))\,dt. \tag{16} \]

First check that $F$ does indeed map $S$ into $S$: if $\phi \in S$, then

\begin{align*}
|F\phi(x) - y_0| &= \Big| \int_{x_0}^{x} f(t, \phi(t))\,dt \Big| \\
&\le \Big| \int_{x_0}^{x} |f(t, \phi(t))|\,dt \Big| \\
&\le R|x - x_0| \le R\delta
\end{align*}

by (15), so $F(\phi) \in S$. Moreover,

\begin{align*}
|F\phi(x) - F\tilde{\phi}(x)| &\le \Big| \int_{x_0}^{x} |f(t, \phi(t)) - f(t, \tilde{\phi}(t))|\,dt \Big| \\
&\le M\delta \max_x |\phi(x) - \tilde{\phi}(x)|,
\end{align*}

so that

\[ \|F(\phi) - F(\tilde{\phi})\| \le M\delta \|\phi - \tilde{\phi}\|, \]

after taking sups over $x$. By construction, $M\delta < 1$, so that $F$ is a contraction
mapping. It follows from Corollary 2.11 that the operator $F$ has a unique fixed
point in $S$, so the differential equation (12) with initial condition (13) has a unique
solution. □
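
The proof is constructive: starting from any function in $S$ and applying the operator $F$ of (16) repeatedly produces approximations to the solution. The following sketch carries this out numerically under several assumptions of our own: the integral is replaced by the trapezoidal rule on a uniform grid, only the right half $[x_0, x_0 + \delta]$ of the interval is treated, and the test problem $y' = y$, $y(0) = 1$ with $\delta = \frac12$ (so $M = 1$ and $M\delta < 1$) is chosen because its solution $e^x$ is known.

```python
import math


def picard_step(f, xs, phi, y0):
    """One application of the integral operator (16), using the trapezoidal rule on the grid xs."""
    values = [f(x, y) for x, y in zip(xs, phi)]
    new_phi = [y0]
    for k in range(1, len(xs)):
        h = xs[k] - xs[k - 1]
        new_phi.append(new_phi[-1] + 0.5 * h * (values[k - 1] + values[k]))
    return new_phi


def picard(f, x0, y0, delta, iterations, points=1001):
    """Iterate the operator starting from the constant function y0 on [x0, x0 + delta]."""
    xs = [x0 + delta * k / (points - 1) for k in range(points)]
    phi = [y0] * points
    for _ in range(iterations):
        phi = picard_step(f, xs, phi, y0)
    return xs, phi


def rhs(x, y):
    """Right-hand side of the illustrative test problem y' = y."""
    return y


xs, phi = picard(rhs, 0.0, 1.0, 0.5, 30)
print(phi[-1], math.exp(0.5))   # approximate and exact values at x = 0.5
```

The two printed numbers should agree to several decimal places; the remaining difference comes from the trapezoidal rule rather than from the fixed-point iteration.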
The conditions on the set $G$ used in Theorem 2.14 arise very often so it is
useful to have a short description for them. A domain in a normed linear space
$X$ is an open connected set (cf. Definition 1.23). An example of a domain in $\mathbb{R}$
containing the point $a$ is an interval $(a - \delta, a + \delta)$ for some $\delta > 0$. Notice that
if $G$ is a domain in $(X, \|\cdot\|_X)$, and $a \in G$ then for some $\epsilon > 0$ the open ball
$B_{\epsilon}(a) = \{x \in X \mid \|x - a\|_X < \epsilon\}$ lies in $G$ (cf. the definition of an open set).

Picard's theorem easily generalises to systems of simultaneous differential equa- 
tions, and we shall see in the next section that the contraction mapping method 
also applies to certain integral equations. 

Theorem 2.15. Let $G$ be a domain in $\mathbb{R}^{n+1}$ containing the point $(x_0, y_{01}, \dots, y_{0n})$,
and let $f_1, \dots, f_n$ be continuous functions from $G$ to $\mathbb{R}$ each satisfying a Lipschitz
condition

\[ |f_i(x, y_1, \dots, y_n) - f_i(x, \tilde{y}_1, \dots, \tilde{y}_n)| \le M \max_{1 \le j \le n} |y_j - \tilde{y}_j| \tag{17} \]

in the variables $y_1, \dots, y_n$. Then there is an interval $(x_0 - \delta, x_0 + \delta)$ on which the
system of simultaneous ordinary differential equations

\[ \frac{dy_i}{dx} = f_i(x, y_1, \dots, y_n) \quad \text{for } i = 1, \dots, n \tag{18} \]

has a unique solution

\[ y_1 = \phi_1(x),\ \dots,\ y_n = \phi_n(x) \]

satisfying the initial conditions

\[ \phi_1(x_0) = y_{01},\ \dots,\ \phi_n(x_0) = y_{0n}. \tag{19} \]

Proof. As in the proof of Theorem 2.14, write the system defined by (18) and
(19) in integral form

\[ \phi_i(x) = y_{0i} + \int_{x_0}^{x} f_i(t, \phi_1(t), \dots, \phi_n(t))\,dt \quad \text{for } i = 1, \dots, n. \tag{20} \]

Since each of the functions $f_i$ is continuous on $G$, there is a bound

\[ |f_i(x, y_1, \dots, y_n)| \le R \tag{21} \]

in some domain $G' \subset G$ with $G' \ni (x_0, y_{01}, \dots, y_{0n})$. Choose $\delta > 0$ with the
properties that

(1) $|x - x_0| \le \delta$ and $\max_i |y_i - y_{0i}| \le R\delta$ together imply that $(x, y_1, \dots, y_n) \in G'$;

(2) $M\delta < 1$.

Now define the set $S$ to be the set of $n$-tuples $\phi = (\phi_1, \dots, \phi_n)$ of continuous
functions defined on the interval $[x_0 - \delta, x_0 + \delta]$ and such that $|\phi_i(x) - y_{0i}| \le R\delta$ for all
$i = 1, \dots, n$. The set $S$ may be equipped with the metric

\[ \|\phi - \tilde{\phi}\| = \max_{x, i} |\phi_i(x) - \tilde{\phi}_i(x)|. \]

It is easy to check that $S$ is complete. The mapping $F$ defined by the set of integral
operators

\[ (F(\phi))_i(x) = y_{0i} + \int_{x_0}^{x} f_i(t, \phi_1(t), \dots, \phi_n(t))\,dt \quad \text{for } |x - x_0| \le \delta,\ i = 1, \dots, n \]

is a contraction from $S$ to itself. To see this, first notice that if
$\phi = (\phi_1, \dots, \phi_n) \in S$ and $|x - x_0| \le \delta$ then

\[ |(F(\phi))_i(x) - y_{0i}| = \Big| \int_{x_0}^{x} f_i(t, \phi_1(t), \dots, \phi_n(t))\,dt \Big| \le R\delta \quad \text{for } i = 1, \dots, n \]

by (21), so that $F(\phi) = (F(\phi)_1, \dots, F(\phi)_n)$ lies in $S$. It remains to check that $F$ is
a contraction:

\begin{align*}
|(F(\phi))_i(x) - (F(\tilde{\phi}))_i(x)| &\le \Big| \int_{x_0}^{x} |f_i(t, \phi_1(t), \dots, \phi_n(t)) - f_i(t, \tilde{\phi}_1(t), \dots, \tilde{\phi}_n(t))|\,dt \Big| \\
&\le M\delta \max_i \max_t |\phi_i(t) - \tilde{\phi}_i(t)|;
\end{align*}

after maximising over $x$ and $i$ we have

\[ \|F(\phi) - F(\tilde{\phi})\| \le M\delta \|\phi - \tilde{\phi}\|, \]

so $F : S \to S$ is a contraction. It follows that the equation (20) has a unique
solution, so the system of differential equations (18) and (19) has a unique solution. □



4. Applications to integral equations 

Integral equations may be a little less familiar than differential equations (though 
we have seen already that the two are intimately connected), so we begin with some 
important examples. The theory of integral equations is largely modern (twentieth- 
century) mathematics, but several specific instances of integral equations had ap- 
peared earlier. 

Certain problems in physics led to the need to "invert" the integral equation

\[ g(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ixy} f(y)\,dy \tag{22} \]

for functions $f$ and $g$ of specific kinds. This was solved - formally at least - by
Fourier in 1811, who noted that (22) requires that

\[ f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ixy} g(y)\,dy. \]

We shall see later that this is really due to properties of particularly good Banach
spaces called Hilbert spaces.



Jean Baptiste Joseph Fourier (1768-1830), who pursued interests in mathematics and
mathematical physics. He became famous for his Théorie analytique de la Chaleur (1822), a
mathematical treatment of the theory of heat. He established the partial differential equation governing
heat diffusion and solved it by using infinite series of trigonometric functions. Though these series
had been used before, Fourier investigated them in much greater detail. His research, initially
criticized for its lack of rigour, was later shown to be valid. It provided the impetus for later work
on trigonometric series and the theory of functions of a real variable.
Abel studied generalizations of the tautochrone problem, and was led to the
integral equation

\[ g(x) = \int_a^x \frac{f(y)}{(x - y)^b}\,dy, \quad b \in (0,1),\ g(a) = 0, \]

for which he found the solution

\[ f(y) = \frac{\sin \pi b}{\pi} \int_a^y \frac{g'(x)}{(y - x)^{1-b}}\,dx. \]

This equation is an example of a Volterra equation.

We shall briefly study two kinds of integral equation (though the second is 
formally a special case of the first). 

Example 2.16. A Fredholm equation is an integral equation of the form

\[ f(x) = \lambda \int_a^b K(x,y) f(y)\,dy + \phi(x), \tag{23} \]

where $K$ and $\phi$ are two given functions, and we seek a solution $f$ in terms of
the arbitrary (constant) parameter $\lambda$. The function $K$ is called the kernel of the
equation, and the equation is called homogeneous if $\phi = 0$.

We assume that $K(x,y)$ and $\phi(x)$ are continuous on the square $\{(x,y) \mid a \le
x \le b, a \le y \le b\}$. It follows in particular (see Section 1.9) that there is a bound
$M$ so that

\[ |K(x,y)| \le M \quad \text{for all } a \le x \le b,\ a \le y \le b. \]

Define a mapping $F : C[a,b] \to C[a,b]$ by

\[ (F(f))(x) = \lambda \int_a^b K(x,y) f(y)\,dy + \phi(x). \tag{24} \]

Now

\begin{align*}
\|F(f_1) - F(f_2)\| &= \max_x |F(f_1)(x) - F(f_2)(x)| \\
&\le |\lambda| M (b - a) \max_x |f_1(x) - f_2(x)| \\
&= |\lambda| M (b - a) \|f_1 - f_2\|,
\end{align*}

Niels Henrik Abel (1802-1829) was a brilliant Norwegian mathematician. He earned wide
recognition at the age of 18 with his first paper, in which he proved that the general polynomial
equation of the fifth degree is insolvable by algebraic procedures (problems of this sort are studied
in Galois Theory). Abel was instrumental in establishing mathematical analysis on a rigorous
basis. In his major work, Recherches sur les fonctions elliptiques (Investigations on Elliptic
Functions, 1827), he revolutionized the understanding of elliptic functions by studying the inverse of
these functions.

Also called an isochrone: a curve along which a pendulum takes the same time to make a
complete oscillation independent of the amplitude of the oscillation. The resulting differential
equation was solved by James Bernoulli in May 1690, who showed that the result is a cycloid.

Vito Volterra (1860-1940) succeeded Beltrami as professor of Mathematical Physics at
Rome. His method for solving the equations that carry his name is exactly the one we shall
use. He worked widely in analysis and integral equations, and helped drive Lebesgue to produce a
more sophisticated integration by giving an example of a function with bounded derivative whose
derivative is not Riemann integrable.

This is really a Fredholm equation "of the second kind", named after the Swedish geometer
Erik Ivar Fredholm (1866-1927).
so that $F$ is a contraction mapping if

\[ |\lambda| < \frac{1}{M(b - a)}. \]

It follows by Theorem 2.9 that the equation (23) has a unique continuous solution
$f$ for small enough values of $\lambda$, and the solution may be obtained by starting with
any continuous function $f_0$ and then iterating the scheme

\[ f_{n+1}(x) = \lambda \int_a^b K(x,y) f_n(y)\,dy + \phi(x). \]
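
The iterative scheme can be tried numerically. The sketch below rests on assumptions of our own: the integral is approximated by the midpoint rule, and the data $K(x,y) = xy$, $\phi(x) = x$ on $[0,1]$ with $\lambda = \frac12$ are chosen because then $M = 1$, the condition $|\lambda| < 1/(M(b-a))$ holds, and the exact solution $f(x) = 3x/(3 - \lambda) = 1.2x$ can be found by hand for comparison.

```python
def fredholm_iterate(kernel, phi, lam, a, b, iterations, points=200):
    """Iterate f_{n+1}(x) = lam * integral_a^b kernel(x, y) f_n(y) dy + phi(x), starting from f_0 = 0.

    The integral is approximated by the midpoint rule with `points` nodes.
    Returns the nodes and the values of the final iterate at those nodes.
    """
    h = (b - a) / points
    ys = [a + (j + 0.5) * h for j in range(points)]
    f = [0.0] * points
    for _ in range(iterations):
        f = [lam * h * sum(kernel(x, y) * fy for y, fy in zip(ys, f)) + phi(x) for x in ys]
    return ys, f


def kernel(x, y):
    """Illustrative kernel K(x, y) = x * y, bounded by M = 1 on the unit square."""
    return x * y


def phi(x):
    """Illustrative inhomogeneous term."""
    return x


ys, f = fredholm_iterate(kernel, phi, 0.5, 0.0, 1.0, 60)
print(max(abs(fx - 1.2 * x) for x, fx in zip(ys, f)))   # distance from the exact solution 1.2 x
```

The printed number is the largest deviation from the exact solution at the grid nodes; it reflects the quadrature error, since the iteration itself has converged long before 60 steps.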



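Example 2.17. Now consider the Volterra equation,

\[ f(x) = \lambda \int_a^x K(x,y) f(y)\,dy + \phi(x), \tag{25} \]

which only differs from the Fredholm equation (23) in that the variable $x$ appears
as the upper limit of integration. As before, define a function $F : C[a,b] \to C[a,b]$
by

\[ (F(f))(x) = \lambda \int_a^x K(x,y) f(y)\,dy + \phi(x). \]

Then for $f_1, f_2 \in C[a,b]$ we have

\begin{align*}
|F(f_1)(x) - F(f_2)(x)| &= \Big| \lambda \int_a^x K(x,y)[f_1(y) - f_2(y)]\,dy \Big| \\
&\le |\lambda| M (x - a) \max_x |f_1(x) - f_2(x)|,
\end{align*}

where $M = \max_{x,y} |K(x,y)| < \infty$. It follows that

\begin{align*}
|F^2(f_1)(x) - F^2(f_2)(x)| &= \Big| \lambda \int_a^x K(x,y)[F(f_1)(y) - F(f_2)(y)]\,dy \Big| \\
&\le |\lambda| M \int_a^x |F(f_1)(y) - F(f_2)(y)|\,dy \\
&\le |\lambda|^2 M^2 \max_x |f_1(x) - f_2(x)| \int_a^x |y - a|\,dy \\
&= |\lambda|^2 M^2 \frac{(x - a)^2}{2} \max_x |f_1(x) - f_2(x)|,
\end{align*}

and in general

\begin{align*}
|F^n(f_1)(x) - F^n(f_2)(x)| &\le |\lambda|^n M^n \frac{(x - a)^n}{n!} \max_x |f_1(x) - f_2(x)| \\
&\le |\lambda|^n M^n \frac{(b - a)^n}{n!} \max_x |f_1(x) - f_2(x)|.
\end{align*}

It follows that

\[ \|F^n f_1 - F^n f_2\| \le |\lambda|^n M^n \frac{(b - a)^n}{n!} \|f_1 - f_2\| \]

so that $F^n$ is a contraction mapping if $n$ is chosen large enough to ensure that

\[ |\lambda|^n M^n \frac{(b - a)^n}{n!} < 1. \]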



If we extend the definition of the kernel $K(x,y)$ appearing in (25) by setting $K(x,y) = 0$
for all $y > x$ then (25) becomes an instance of the Fredholm equation (23). This is not done
because the contraction mapping method applied to the Volterra equation directly gives a better
result in that the condition on $\lambda$ can be avoided.
It follows by Corollary 2.11 that the equation (25) has a unique solution for all $\lambda$.
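
The reason no condition on $\lambda$ is needed is that $n!$ eventually beats any geometric growth. As a small illustration (the function name and the sample values are our own), the following sketch finds the smallest $n$ for which $|\lambda|^n M^n (b-a)^n / n! < 1$, that is, the first power of $F$ that the estimate above guarantees to be a contraction.

```python
def smallest_contracting_power(lam, M, a, b):
    """Return the least n >= 1 with (|lam| * M * (b - a))**n / n! < 1."""
    c = abs(lam) * M * (b - a)
    n = 1
    term = c                 # term always equals c**n / n!
    while term >= 1.0:
        n += 1
        term *= c / n
    return n


print(smallest_contracting_power(0.5, 1.0, 0.0, 1.0))   # F itself is already a contraction
print(smallest_contracting_power(5.0, 1.0, 0.0, 1.0))   # a higher power of F is needed
```

The term is updated by multiplying by $c/n$ at each step, which avoids computing large powers and factorials separately.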



CHAPTER 3 



Linear Transformations 



Let $X$ and $Y$ be linear spaces, and $T$ a function from the set $D_T \subset X$ into $Y$.
Sometimes such functions will be called operators, mappings or transformations.
The set $D_T$ is the domain of $T$, and $T(D_T) \subset Y$ is the range of $T$. The function $T$ is
a linear operator if the set $D_T$ is a linear subspace of $X$ and $T$ is a linear map, that is,

\[ T(\alpha x + \beta y) = \alpha T(x) + \beta T(y) \quad \text{for all } \alpha, \beta \in \mathbb{R} \text{ or } \mathbb{C},\ x, y \in D_T. \tag{26} \]

Notice that a linear operator is injective if and only if the kernel $\{x \in X \mid Tx = 0\}$
is trivial.

Lemma 3.1. A linear transformation $T : X \to Y$ is continuous if and only if
it is continuous at one point.

Proof. Assume that $T$ is continuous at a point $a$. Then for any sequence
$x_n \to a$, $T(x_n) \to T(a)$. Let $z$ be any point in $X$, and $(y_n)$ a sequence with $y_n \to z$.
Then $y_n - z + a$ is a sequence converging to $a$, so $T(y_n - z + a) = T(y_n) - T(z) +
T(a) \to T(a)$. It follows that $T(y_n) \to T(z)$. □

A simple observation that is useful in differential equations, where it is called
the principle of superposition: if $\sum_{n=1}^{\infty} a_n x_n$ is convergent, and $T$ is a continuous
linear map, then $T\big( \sum_{n=1}^{\infty} a_n x_n \big) = \sum_{n=1}^{\infty} a_n T x_n$.

1. Bounded operators 

Example 3.2. Consider a voltage $v(t)$ applied to a resistor $R$, capacitor $C$,
and inductor $L$ arranged in series (an "LCR" circuit). The charge $u = u(t)$ on the
capacitor satisfies the equation

\[ L \frac{d^2 u}{dt^2} + R \frac{du}{dt} + \frac{1}{C} u = v, \tag{27} \]

with some initial conditions, say $u(0) = 0$, $\frac{du}{dt}(0) = 0$. Assuming that $R^2 > 4L/C$,
the solution of (27) is

\[ u(t) = \int_0^t k(t - s) v(s)\,ds, \tag{28} \]

where

\[ k(t) = \frac{e^{\lambda_1 t} - e^{\lambda_2 t}}{L(\lambda_1 - \lambda_2)} \]

and $\lambda_1, \lambda_2$ are the (distinct) roots of $L\lambda^2 + R\lambda + \frac{1}{C} = 0$.

This problem may be phrased in terms of linear operators. Let $X = C[0, \infty)$;
then the transformation defined by $T(v) = u$ in (28) is a linear operator from $X$ to
$X$.



Similarly, (27) can be written in the form $S(u) = v$ for some linear operator
$S$. However, $S$ cannot be defined on all of $X$ - only on the dense linear subspace
of twice-differentiable functions. The transformations $T$ and $S$ are closely related,
and we would like to develop a framework for viewing them as inverse to each other.

Definition 3.3. A linear transformation $T : X \to Y$ is (algebraically)
invertible if there is a linear transformation $S : Y \to X$ with the property that $TS = 1_Y$
and $ST = 1_X$.

For example, in Example 3.2 if we take $X = C[0, \infty)$ and $Y = C^2[0, \infty)$, then
$T$ is algebraically invertible with $T^{-1} = S$.

Definition 3.4. A linear operator $T : X \to Y$ is bounded if there is a constant
$K$ such that

\[ \|Tx\|_Y \le K \|x\|_X \quad \text{for all } x \in X. \]

The norm of the bounded linear operator $T$ is

\[ \|T\| = \sup_{x \neq 0} \frac{\|Tx\|_Y}{\|x\|_X}. \tag{29} \]

Example 3.5. In Example 3.2 the operator $T$ is bounded when restricted to
$C[0, a]$ for any $a$, since

\[ |Tv(t)| \le \int_0^t |k(t - s)| \cdot |v(s)|\,ds, \]

which shows that

\[ \|Tv\|_{\infty} \le a \sup_{0 \le t \le a} |k(t)|\, \|v\|_{\infty} \le \frac{2a}{L|\lambda_1 - \lambda_2|} \|v\|_{\infty}. \]

The operator $S$ is not bounded of course - think about what differentiation does.

Exercise 3.6. [1] Show that $\|T\| = \sup_{\|x\| = 1} \{\|Tx\|_Y\}$.
[2] Prove the following useful inequality:

\[ \|Tx\|_Y \le \|T\| \cdot \|x\|_X \quad \text{for all } x \in X. \tag{30} \]

Theorem 3.7. A linear transformation $T : X \to Y$ is continuous if and only
if it is bounded.

Proof. If $T$ is bounded and $x_n \to 0$, then by Definition 3.4 $Tx_n \to 0$ also. It
follows that $T$ is continuous at 0, so by Lemma 3.1 $T$ is continuous everywhere.

Conversely, suppose that $T$ is continuous but unbounded. Then for any $n \in \mathbb{N}$
there is a point $x_n$ with $\|Tx_n\| > n\|x_n\|$. Let $y_n = \frac{x_n}{n\|x_n\|}$, so that $y_n \to 0$ as $n \to \infty$.
On the other hand, $\|Ty_n\| > 1$ and $T(0) = 0$, contradicting the assumption that $T$
is continuous at 0. □

2. The space of linear operators 

The set of all linear transformations $X \to Y$ is itself a linear space with the
operations

\[ (T + S)(x) = Tx + Sx, \qquad (\lambda T)(x) = \lambda Tx. \]

Denote this linear space by $\mathcal{L}(X, Y)$. If $X$ and $Y$ are normed spaces, denote by
$B(X, Y)$ the subspace of continuous linear transformations. If $X = Y$, then write
$\mathcal{L}(X, X) = \mathcal{L}(X)$ and $B(X, X) = B(X)$.
Lemma 3.8. Let $X$ and $Y$ be normed spaces. Then $B(X, Y)$ is a normed linear
space with the norm (29). If in addition $Y$ is a Banach space, then $B(X, Y)$ is a
Banach space.

Proof. We have to show that the function $T \mapsto \|T\|$ satisfies the conditions
of Definition 1.7.

(1) It is clear that $\|T\| \ge 0$ since it is defined as the supremum of a set of
non-negative numbers. If $\|T\| = 0$ then $\|Tx\|_Y = 0$ for all $x$, so $Tx = 0$ for all $x$ - that
is, $T = 0$.

(2) The triangle inequality is also clear:

\[ \|T + S\| = \sup_{\|x\| = 1} \|(T + S)x\| \le \sup_{\|x\| = 1} \|Tx\| + \sup_{\|x\| = 1} \|Sx\| = \|T\| + \|S\|. \]

(3) $\|\lambda T\| = \sup_{\|x\| = 1} \|(\lambda T)x\| = |\lambda| \sup_{\|x\| = 1} \|Tx\| = |\lambda| \|T\|$.

Finally, assume that $Y$ is a Banach space and let $(T_n)$ be a Cauchy sequence in
$B(X, Y)$. Then the sequence is bounded: there is a constant $K$ with $\|T_n x\| \le K\|x\|$
for all $x \in X$ and $n \ge 1$. Since $\|T_n x - T_m x\| \le \|T_n - T_m\| \|x\| \to 0$ as $n > m \to \infty$,
the sequence $(T_n x)$ is a Cauchy sequence in $Y$ for each $x \in X$. Since $Y$ is complete,
for each $x \in X$ the sequence $(T_n x)$ converges; define $T$ by

\[ Tx = \lim_{n \to \infty} T_n x. \]

It is clear that $T$ is linear, and $\|Tx\| \le K\|x\|$ for all $x$, so $T \in B(X, Y)$.

We have not yet established that $T_n \to T$ in the norm of $B(X, Y)$ (cf. (29)).
Since $(T_n)$ is Cauchy, for any $\epsilon > 0$ there is an $N$ such that

\[ \|T_m - T_n\| < \epsilon \quad \text{for all } m > n > N. \]

For any $x \in X$ we therefore have

\[ \|T_m x - T_n x\| \le \epsilon \|x\|. \]

Take the limit as $m \to \infty$ to see that

\[ \|Tx - T_n x\| \le \epsilon \|x\|, \]

so that $\|T - T_n\| \le \epsilon$ if $n > N$. This proves that $\|T - T_n\| \to 0$ as $n \to \infty$. □

Example 3.9. Once the space of linear operators is known to be complete, we
can do analysis on the operators themselves. For example, if $X$ is a Banach space
and $A \in B(X)$, then we may define an operator

\[ e^A = I + A + \frac{1}{2!} A^2 + \frac{1}{3!} A^3 + \dots, \]

which makes sense since

\[ \|e^A\| \le 1 + \|A\| + \frac{1}{2!} \|A\|^2 + \dots \le e^{\|A\|}. \]

This is particularly useful in linear systems theory and control theory; if $x(t) \in \mathbb{R}^n$
then the linear differential equation $\frac{dx}{dt} = Ax(t)$, $x(0) = x_0$, where $A$ is an $n \times n$
matrix, has as solution $x(t) = e^{At} x_0$.
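
For matrices the series can be summed directly. The sketch below is a naive illustration under our own assumptions (a fixed number of terms, matrices stored as lists of rows, helper names invented for the purpose); it is not a recommended numerical method for large or badly scaled matrices, but it shows the partial sums converging and the formula $x(t) = e^{At} x_0$ at work for a rotation generator.

```python
import math


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(A, terms=30):
    """Partial sum I + A + A^2/2! + ... with `terms` terms of the exponential series."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]           # holds A^m / m!, starting with m = 0
    for m in range(1, terms):
        power = mat_mul(power, A)
        power = [[entry / m for entry in row] for row in power]
        result = [[result[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return result


def solve_linear_ode(A, x0, t):
    """Evaluate x(t) = e^{At} x0 using the series above."""
    E = mat_exp([[t * entry for entry in row] for row in A])
    return [sum(E[i][j] * x0[j] for j in range(len(x0))) for i in range(len(x0))]


A = [[0.0, -1.0], [1.0, 0.0]]                    # generator of rotations in the plane
print(solve_linear_ode(A, [1.0, 0.0], 1.0))      # compare with (cos 1, sin 1)
print(math.cos(1.0), math.sin(1.0))
```

For a nilpotent matrix the series terminates, so for instance the exponential of the matrix with rows (0, 1) and (0, 0) is exactly the matrix with rows (1, 1) and (0, 1).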
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3. Banach algebras 

In many situations it makes sense to multiply elements of a normed linear space 
together. 

Definition 3.10. Let $X$ be a Banach space, and assume there is a
multiplication $(x, y) \mapsto xy$ from $X \times X \to X$ such that addition and multiplication make $X$
into a ring, and

\[ \|xy\| \le \|x\| \|y\|. \]

Then $X$ is called a Banach algebra.

Recall that a ring does not need to have a unit; if X has a unit then it is called 
unital. 

Example 3.11. [1] The continuous functions $C[0, 1]$ with sup norm form a
Banach algebra with $(fg)(x) = f(x)g(x)$.
[2] If $X$ is any Banach space, then $B(X)$ is a Banach algebra:

\[ \|ST\| = \sup_{\|x\| = 1} \|(ST)x\| = \sup_{\|x\| = 1} \|S(Tx)\| \le \|S\| \sup_{\|x\| = 1} \|Tx\| = \|S\| \|T\|. \]

The algebra has an identity, namely $I(x) = x$.

[3] A special case of [2] is the case $X = \mathbb{R}^n$. By choosing a basis for $\mathbb{R}^n$ we may
identify $B(\mathbb{R}^n)$ with the space of $n \times n$ real matrices.

In the next few sections we will prove the more technical results about linear 
transformations that provide the basic tools of functional analysis. 

4. Uniform boundedness 

The first theorem is the principle of uniform boundedness or the
Banach-Steinhaus theorem.

Theorem 3.12. Let $X$ be a Banach space and let $Y$ be a normed linear space.
Let $\{T_\alpha\}$ be a family of bounded linear operators from $X$ into $Y$. If, for each $x \in X$,
the set $\{T_\alpha x\}$ is a bounded subset of $Y$, then the set $\{\|T_\alpha\|\}$ is bounded.

Proof. Assume first that there is a ball $B_\epsilon(x_0)$ on which $\{T_\alpha x\}$ (a set of
functions) is uniformly bounded: that is, there is a constant $K$ such that

\[ \|T_\alpha x\| \le K \quad \text{if } \|x - x_0\| \le \epsilon. \tag{31} \]

Then it is possible to find a uniform bound on the whole family $\{\|T_\alpha\|\}$. For any
$y \neq 0$ define

\[ z = \frac{\epsilon}{\|y\|} y + x_0. \]

Then $\|z - x_0\| = \epsilon$ by construction, so (31) implies that $\|T_\alpha z\| \le K$.
Now by linearity of $T_\alpha$ this shows that

\[ \frac{\epsilon}{\|y\|} \|T_\alpha y\| - \|T_\alpha x_0\| \le \Big\| \frac{\epsilon}{\|y\|} T_\alpha y + T_\alpha x_0 \Big\| = \|T_\alpha z\| \le K, \]

which can be solved for $\|T_\alpha y\|$:

\[ \|T_\alpha y\| \le \frac{K + \|T_\alpha x_0\|}{\epsilon} \|y\| \le \frac{K + K'}{\epsilon} \|y\| \]

where $K' = \sup_\alpha \|T_\alpha x_0\| < \infty$. It follows that

\[ \|T_\alpha\| \le \frac{K + K'}{\epsilon} \]

as required.

To finish the proof we have to show that there is a ball on which property (31)
holds. This is proved by a contradiction argument: assume that there is no
ball on which (31) holds. Fix an arbitrary ball $B_0$. By assumption there is a point
$x_1 \in B_0$ such that

\[ \|T_{\alpha_1} x_1\| > 1 \]

for some index $\alpha_1$ say. Since each $T_\alpha$ is continuous, there is a ball $B_{\epsilon_1}(x_1)$ in which
$\|T_{\alpha_1}(x)\| > 1$. Assume without loss of generality that $\epsilon_1 < 1$. By assumption, in
this new ball the family $\{T_\alpha x\}$ is not bounded, so there is a point $x_2 \in B_{\epsilon_1}(x_1)$
with

\[ \|T_{\alpha_2} x_2\| > 2 \]

for some index $\alpha_2 \neq \alpha_1$. Continue in the same way: by continuity of $T_{\alpha_2}$ there is
a ball $B_{\epsilon_2}(x_2) \subset B_{\epsilon_1}(x_1)$ on which $\|T_{\alpha_2} x\| > 2$. Assume without loss of generality
that $\epsilon_2 < \frac12$.

Repeating this process produces points $x_3, x_4, x_5, \dots$, indices $\alpha_3, \alpha_4, \alpha_5, \dots$,
and positive numbers $\epsilon_3, \epsilon_4, \epsilon_5, \dots$ such that $B_{\epsilon_n}(x_n) \subset B_{\epsilon_{n-1}}(x_{n-1})$, $\epsilon_n < \frac1n$, all
the $\alpha_j$'s are distinct, and

\[ \|T_{\alpha_n} x\| > n \quad \text{for all } x \in B_{\epsilon_n}(x_n). \]

Now the sequence $(x_n)$ is clearly Cauchy and therefore converges to $z \in X$ say
(equivalently, prove that $\bigcap_{n=1}^{\infty} B_{\epsilon_n}(x_n)$ contains the single point $z$).

By construction, $\|T_{\alpha_n} z\| \ge n$ for all $n \ge 1$, which contradicts the hypothesis
that the set $\{T_\alpha z\}$ is bounded. □

Recall the operator norm in Definition 3.4. Corresponding to this norm there
is a notion of convergence in $B(X, Y)$: we say that a sequence $(T_n)$ is uniformly
convergent if there is $T \in B(X, Y)$ with $\|T_n - T\| \to 0$ as $n \to \infty$ (so uniform
convergence of a sequence of operators is simply convergence in the operator norm).

Definition 3.13. A sequence $(T_n)$ in $B(X, Y)$ is strongly convergent if, for
any $x \in X$, the sequence $(T_n x)$ converges in $Y$. If there is a $T \in B(X, Y)$ with
$\lim_n T_n x = Tx$ for all $x \in X$, then $(T_n)$ is strongly convergent to $T$.

Exercise 3.14. [1] Prove that uniform convergence implies strong
convergence.

[2] Show by example that strong convergence does not imply uniform convergence.

Theorem 3.15. Let $X$ be a Banach space, and $Y$ any normed linear space. If
a sequence $(T_n)$ in $B(X, Y)$ is strongly convergent, then there exists $T \in B(X, Y)$
such that $(T_n)$ is strongly convergent to $T$.

Proof. For each $x \in X$ the sequence $(T_n x)$ is bounded since it is convergent.
By the uniform boundedness principle (Theorem 3.12), there is a constant $K$ such
that $\|T_n\| \le K$ for all $n$. Hence

\[ \|T_n x\| \le K\|x\| \quad \text{for all } x \in X. \tag{32} \]

Define $T$ by requiring that $Tx = \lim_{n \to \infty} T_n x$ for all $x \in X$. It is clear that $T$ is
linear, and (32) shows that $\|Tx\| \le K\|x\|$ for all $x \in X$, showing that $T$ is bounded.
The construction of $T$ means that $(T_n)$ converges strongly to $T$. □

5. An application of uniform boundedness to Fourier series 

This section is an application of Theorem 3.12 to Fourier analysis. We will
encounter Fourier analysis again, in the context of Hilbert spaces and $L_2$ functions.
For now we take a naive view of Fourier analysis: the functions will all be continuous
periodic functions, and we compute Fourier coefficients using Riemann integration.



Lemma 3.16.

$\int_0^{2\pi} \left| \frac{\sin(n + \frac{1}{2})x}{\sin \frac{1}{2}x} \right| dx \to \infty$ as $n \to \infty$.



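PROOF. Recall that $|\sin(x)| \le |x|$ for all $x$. It follows that

$\int_0^{2\pi} \left| \frac{\sin(n + \frac{1}{2})x}{\sin \frac{1}{2}x} \right| dx \ge \int_0^{2\pi} \frac{2}{x} \left| \sin(n + \tfrac{1}{2})x \right| dx.$

Now $|\sin(n + \frac{1}{2})x| \ge \frac{1}{2}$ for all $x$ with $(n + \frac{1}{2})x$ between $k\pi + \frac{1}{6}\pi$ and $k\pi + \frac{5}{6}\pi$ for
$k = 1, 2, \dots$. For $k \le 2n$ these intervals lie in $[0, 2\pi]$, each has length $\frac{2\pi}{3}(n + \frac{1}{2})^{-1}$, and on the
$k$th one $\frac{1}{x} \ge \left(\frac{\pi(k+1)}{n + \frac{1}{2}}\right)^{-1}$. It follows (by thinking of the Riemann approximation to the integral)
that

$\int_0^{2\pi} \frac{2}{x} \left| \sin(n + \tfrac{1}{2})x \right| dx \ge \sum_{k=1}^{2n} \left( \frac{\pi(k+1)}{n + \frac{1}{2}} \right)^{-1} \cdot \frac{2\pi}{3} \left( n + \tfrac{1}{2} \right)^{-1} = \frac{2}{3} \sum_{k=1}^{2n} \frac{1}{k+1} \to \infty$

as $n \to \infty$. □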
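The divergence in Lemma 3.16 is slow: the normalised integrals (the Lebesgue constants) are known to grow only logarithmically in $n$. The following is a minimal numerical sketch, not part of the notes; the function name `lebesgue_constant` and the choice of a midpoint rule with 20000 sample points are assumptions made for illustration.

```python
import math

def lebesgue_constant(n, samples=20000):
    """Midpoint-rule estimate of (1/(2*pi)) times the integral over [0, 2*pi]
    of |sin((n + 1/2) x) / sin(x / 2)| dx."""
    h = 2.0 * math.pi / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * h  # midpoints avoid the removable singularities at 0 and 2*pi
        total += abs(math.sin((n + 0.5) * x) / math.sin(0.5 * x))
    return total * h / (2.0 * math.pi)

if __name__ == "__main__":
    # The printed values increase with n, but slowly.
    for n in (1, 10, 100):
        print(n, lebesgue_constant(n))
```

For $n = 1$ the integrand is $|1 + 2\cos x|$ and the value can be checked by hand to be $\frac{1}{3} + \frac{2\sqrt{3}}{\pi}$.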

Definition 3.17. If $f : (0, 2\pi) \to \mathbb{R}$ is Riemann-integrable, then the Fourier
series of $f$ is the series

$s(x) = \sum_{m=-\infty}^{\infty} a_m e^{imx}$, where $a_m = \frac{1}{2\pi} \int_0^{2\pi} f(t) e^{-imt} \, dt$.

Extend the definition of $f$ to make it $2\pi$-periodic, so $f(x + 2\pi) = f(x)$ for all
$x$. Define the $n$th partial sum of the Fourier series to be

$s_n(x) = \sum_{m=-n}^{n} a_m e^{imx}.$

The basic questions of Fourier analysis are then the following: is there any relation
between $s(x)$ and $f(x)$? Does the function $s_n(x)$ approximate $f(x)$ for large $n$ in
some sense?

Lemma 3.18. Let $D_n(x) = \frac{\sin(n + \frac{1}{2})x}{\sin \frac{1}{2}x}$. Then

$s_n(y) = \frac{1}{2\pi} \int_0^{2\pi} f(y + x) D_n(x) \, dx.$

PROOF. Exercise. □

The function $D_n$ is called the Dirichlet kernel. For the lemma, it is helpful to notice that
$D_n(x) = \sum_{j=-n}^{n} e^{ijx}$. If you read up on Fourier analysis, it will be helpful to note that the
Dirichlet kernel is not a "summability kernel".
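The identity between the closed form of the Dirichlet kernel and the exponential sum can be sanity-checked numerically before attempting the exercise. This is only an illustrative sketch; the function names are hypothetical and a numerical check is of course not a proof.

```python
import cmath
import math

def dirichlet_closed(n, x):
    """Closed form sin((n + 1/2) x) / sin(x / 2), valid when sin(x / 2) != 0."""
    return math.sin((n + 0.5) * x) / math.sin(0.5 * x)

def dirichlet_sum(n, x):
    """Sum of exp(i j x) for j = -n, ..., n; the imaginary parts cancel."""
    return sum(cmath.exp(1j * j * x) for j in range(-n, n + 1)).real

if __name__ == "__main__":
    for n in (0, 3, 7):
        for x in (0.3, 1.0, 2.5, 5.0):
            print(n, x, dirichlet_closed(n, x), dirichlet_sum(n, x))
```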
Now let $X$ be the Banach space of continuous functions $f : [0, 2\pi] \to \mathbb{R}$ with
$f(0) = f(2\pi)$, with the uniform norm.

Lemma 3.19. The linear operator $T_n : X \to \mathbb{R}$ defined by

$T_n(f) = \frac{1}{2\pi} \int_0^{2\pi} f(x) D_n(x) \, dx$

is bounded, and

$\|T_n\| = \frac{1}{2\pi} \int_0^{2\pi} |D_n(x)| \, dx.$



PROOF. For any $f \in X$,

$|T_n(f)| \le \frac{1}{2\pi} \int_0^{2\pi} |f(x)| |D_n(x)| \, dx \le \frac{1}{2\pi} \|f\| \int_0^{2\pi} |D_n(x)| \, dx,$

so

$\|T_n\| \le \frac{1}{2\pi} \int_0^{2\pi} |D_n(x)| \, dx.$

Assume that for some $\delta > 0$ we have

(33) $\|T_n\| = \frac{1}{2\pi} \int_0^{2\pi} |D_n(x)| \, dx - \delta.$

Then since for fixed $n$, $|D_n(x)| \le M_n$ is bounded, we may find a continuous function
$f_n \in X$ with $\|f_n\| \le 1$ that differs from $\mathrm{sign}(D_n(x))$ only on a finite union of intervals whose total length
does not exceed $\frac{1}{M_n}\delta$. Then (don't think about this - just draw a picture)

$\frac{1}{2\pi} \left| \int_0^{2\pi} f_n(x) D_n(x) \, dx \right| > \frac{1}{2\pi} \int_0^{2\pi} |D_n(x)| \, dx - \delta,$

which contradicts the assumption (33). We conclude that

$\|T_n\| = \frac{1}{2\pi} \int_0^{2\pi} |D_n(x)| \, dx.$

□



We are now ready to see a genuinely non-trivial and important observation 
about the basic theorems of Fourier analysis. 

Theorem 3.20. There exists a continuous function $f : [0, 2\pi] \to \mathbb{R}$, with $f(0) = f(2\pi)$, such that its Fourier series diverges at $x = 0$.

PROOF. By Lemma 3.18 we have

$T_n(f) = s_n(0)$

for all $f \in X$. Moreover, for fixed $f \in X$, if the Fourier series of $f$ converges at 0,
then the family $\{T_n f\}$ is bounded as $n$ varies (since each element is just a partial
sum of a convergent series). Thus if the Fourier series of $f$ converges at 0 for all
$f \in X$, then for each $f \in X$ the set $\{T_n f\}$ is bounded. By Theorem 3.12 this
implies that the set $\{\|T_n\|\}$ is bounded, which contradicts Lemma 3.19, since by
Lemma 3.16 the integrals appearing there tend to infinity.

The conclusion is that there must be some $f \in X$ whose Fourier series does not
converge at 0. □
Exercise 3.21. The problem of deciding whether or not the Fourier series of a
given function converges at a specific point (or everywhere) is difficult and usually
requires some degree of smoothness (differentiability). You can read about various
results in many books - a good starting point is Fourier Analysis, Tom Körner,
Cambridge University Press (1988).

It is more natural in functional analysis to ask for an appropriate semi-norm
in which $\|s(x) - f(x)\| = 0$ for some class of functions $f$.

6. Open mapping theorem 

Recall that a continuous map between normed spaces has the property that the
pre-image of any open set is open, but in general the image of an open set is not
open (Exercise 1.24). Bounded linear maps from one Banach space onto another cannot do this.

Theorem 3.22. Let $X$ and $Y$ be Banach spaces, and let $T$ be a bounded linear
map from $X$ onto $Y$. Then $T$ maps open sets in $X$ onto open sets in $Y$.

Of course the assumption that $T$ maps $X$ onto $Y$ is crucial: think of the projection
$(x, y) \mapsto (x, 0)$ from $\mathbb{R}^2 \to \mathbb{R}^2$. This is bounded and linear, but not onto, and
certainly cannot send open sets to open sets.

The proof of the Open-Mapping theorem is long and requires the Baire category
theorem, so it will be omitted from the lectures. For completeness it is given here
in the next three lemmas.

Some notation: use $B_r^X$ and $B_r^Y$ to denote the open balls of radius $r$ centre 0
in $X$ and $Y$ respectively.

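Lemma 3.23. For any $\epsilon > 0$, there is a $\delta > 0$ such that

(34) $\overline{T B_{2\epsilon}^X} \supset B_\delta^Y.$

PROOF. Since $X = \bigcup_{n=1}^{\infty} n B_\epsilon^X$, and $T$ is onto, we have $Y = T(X) = \bigcup_{n=1}^{\infty} n T B_\epsilon^X$.
By the Baire category theorem (Theorem A.7) it follows that, for some $n$, the set
$\overline{n T B_\epsilon^X}$ contains some ball $B_r^Y(z)$ in $Y$. Then $\overline{T B_\epsilon^X}$ must contain the ball $B_\delta^Y(y_0)$,
where $y_0 = \frac{1}{n} z$ and $\delta = \frac{1}{n} r$. It follows that the set

$P = \{y_1 - y_2 \mid y_1 \in B_\delta^Y(y_0), \ y_2 \in B_\delta^Y(y_0)\}$

is contained in the closure of the set $TQ$, where

$Q = \{x_1 - x_2 \mid x_1 \in B_\epsilon^X, \ x_2 \in B_\epsilon^X\} \subset B_{2\epsilon}^X.$

Thus, $\overline{T B_{2\epsilon}^X} \supset P$. Any point $y \in B_\delta^Y$ can be written in the form $y = (y + y_0) - y_0$,
so $B_\delta^Y \subset P$, and (34) follows. □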

Lemma 3.24. For any $\epsilon_0 > 0$ there is a $\delta_0 > 0$ such that

(35) $T B_{2\epsilon_0}^X \supset B_{\delta_0}^Y.$

Proof. Choose a sequence $(\epsilon_n)$ with each $\epsilon_n > 0$ and $\sum_{n=1}^{\infty} \epsilon_n < \epsilon_0$. By
Lemma 3.23 there is a sequence $(\delta_n)$ of positive numbers such that

(36) $\overline{T B_{\epsilon_n}^X} \supset B_{\delta_n}^Y$

for all $n \ge 0$. Without loss of generality, assume that $\delta_n \to 0$ as $n \to \infty$.

Let $y$ be any point in $B_{\delta_0}^Y$. By (36) with $n = 0$ there is a point $x_0 \in B_{\epsilon_0}^X$ with
$\|y - Tx_0\| < \delta_1$. Since $(y - Tx_0) \in B_{\delta_1}^Y$, (36) with $n = 1$ implies that there exists a
point $x_1 \in B_{\epsilon_1}^X$ such that $\|y - Tx_0 - Tx_1\| < \delta_2$. Continuing, we obtain a sequence
$(x_n)$ such that $x_n \in B_{\epsilon_n}^X$ for all $n$, and

(37) $\left\| y - T\left( \sum_{k=0}^{n} x_k \right) \right\| < \delta_{n+1}.$

Since $\|x_n\| < \epsilon_n$, the series $\sum_n x_n$ is absolutely convergent, so by Lemma 2.3 it is
convergent; write $x = \sum_n x_n$. Then

$\|x\| \le \sum_{n=0}^{\infty} \|x_n\| \le \sum_{n=0}^{\infty} \epsilon_n < 2\epsilon_0.$

The map $T$ is continuous, so (37) shows that $y = Tx$ since $\delta_n \to 0$.

That is, for any $y \in B_{\delta_0}^Y$ we have found a point $x \in B_{2\epsilon_0}^X$ such that $Tx = y$,
proving (35). □



Lemma 3.25. For any open set $G \subset X$ and for any point $y = Tx$, $x \in G$, there
is an open ball $B_\eta^Y$ such that $y + B_\eta^Y \subset T(G)$.

Notice that Lemma 3.25 proves Theorem 3.22 since it implies that $T(G)$ is
open.

Proof. Since $G$ is open, there is a ball $B_\epsilon^X$ such that $x + B_\epsilon^X \subset G$. By Lemma
3.24, $T(B_\epsilon^X) \supset B_\eta^Y$ for some $\eta > 0$. Hence

$T(G) \supset T(x + B_\epsilon^X) = T(x) + T(B_\epsilon^X) \supset y + B_\eta^Y.$

□

As an application of Theorem 3.22 we establish a general property of inverse
maps. Generalizing Definition 3.3 slightly, we have the following.

Definition 3.26. Let $T : X \to Y$ be an injective linear operator. Define the
inverse of $T$, $T^{-1}$, by requiring that

$T^{-1} y = x$ if and only if $Tx = y$.

Then the domain of $T^{-1}$ is a linear subspace of $Y$, and $T^{-1}$ is a linear operator.

It is easy to check that $T^{-1} T x = x$ for all $x \in X$, and $T T^{-1} y = y$ for all $y$ in
the domain of $T^{-1}$.

Lemma 3.27. Let $X$ and $Y$ be Banach spaces, and let $T$ be an injective bounded
linear map from $X$ onto $Y$. Then $T^{-1}$ is a bounded linear map.

Proof. Since $T^{-1}$ is a linear operator, we only need to show it is continuous
by Theorem 3.7. By Theorem 3.22, $(T^{-1})^{-1} = T$ maps open sets onto open sets. By
Exercise 1.24[1], this means that $T^{-1}$ is continuous. □

Corollary 3.28. If $X$ is a Banach space with respect to two norms $\|\cdot\|^{(1)}$
and $\|\cdot\|^{(2)}$ and there is a constant $K$ such that

$\|x\|^{(1)} \le K \|x\|^{(2)},$

then the two norms are equivalent: there is another constant $K'$ with

$\|x\|^{(2)} \le K' \|x\|^{(1)}$

for all $x \in X$.

PROOF. Consider the map $T : x \mapsto x$ from $(X, \|\cdot\|^{(2)})$ to $(X, \|\cdot\|^{(1)})$. By
assumption, $T$ is bounded, so by Lemma 3.27 $T^{-1}$ is also bounded, giving the
bound in the other direction. □

Definition 3.29. Let $T : X \to Y$ be a linear operator from a normed linear
space $X$ into a normed linear space $Y$, with domain $D_T$. The graph of $T$ is the set

$G_T = \{(x, Tx) \mid x \in D_T\} \subset X \times Y.$

If $G_T$ is a closed set in $X \times Y$ (see Example 1.15) then $T$ is a closed operator.

Notice as usual that this notion becomes trivial in finite dimensions: if $X$ and
$Y$ are finite-dimensional, then the graph of $T$ is simply some linear subspace, which
is automatically closed. The next theorem is called the closed-graph theorem.

Theorem 3.30. Let $X$ and $Y$ be Banach spaces, and $T : X \to Y$ a linear
operator (notice that the notation means $D_T = X$). If $T$ is closed, then it is
continuous.

PROOF. Fix the norm $\|(x, y)\| = \|x\|_X + \|y\|_Y$ on $X \times Y$. The graph $G_T$ is,
by linearity of $T$, a closed linear subspace in $X \times Y$, so $G_T$ is itself a Banach space.
Consider the projection $P : G_T \to X$ defined by $P(x, Tx) = x$. Then $P$ is clearly
bounded, linear, and bijective. It follows by Lemma 3.27 that $P^{-1}$ is a bounded
linear operator from $X$ into $G_T$, so

$\|(x, Tx)\| = \|P^{-1} x\| \le K \|x\|_X$ for all $x \in X$,

for some constant $K$. It follows that $\|x\|_X + \|Tx\|_Y \le K \|x\|_X$ for all $x \in X$, so $T$
is bounded - and therefore $T$ is continuous by Theorem 3.7. □

7. Hahn-Banach theorem

Let $X$ be a normed linear space. A bounded linear operator from $X$ into the
normed space $\mathbb{R}$ is a (real) continuous linear functional on $X$. The space of all
continuous linear functionals is denoted $B(X, \mathbb{R}) = X^*$, and it is called the dual or
conjugate space of $X$. All the material here may be done again with $\mathbb{C}$ instead of
$\mathbb{R}$ without significant changes.

Notice that Lemma 3.8 shows that $X^*$ is itself a Banach space independently
of $X$.

One of the most important questions one may ask of $X^*$ is the following: are
there "enough" elements in $X^*$? (to do what we need: for example, to separate
points). This is answered in great generality using the Hahn-Banach theorem
(Theorem 3.32 below): see Corollary 3.35. First we prove the Hahn-Banach lemma.

Lemma 3.31. Let $X$ be a real linear space, and $p : X \to \mathbb{R}$ a continuous function
with

$p(x + y) \le p(x) + p(y)$, $p(\lambda x) = \lambda p(x)$ for all $\lambda \ge 0$, $x, y \in X$.

Let $Y$ be a subspace of $X$, and $f \in Y^*$ with

$f(x) \le p(x)$ for all $x \in Y$.

Then there exists a functional $F \in X^*$ such that

$F(x) = f(x)$ for $x \in Y$; $F(x) \le p(x)$ for all $x \in X$.
Proof. Let $\mathcal{K}$ be the set of all pairs $(Y_\alpha, g_\alpha)$ in which $Y_\alpha$ is a linear subspace
of $X$ containing $Y$, and $g_\alpha$ is a real linear functional on $Y_\alpha$ with the properties that

$g_\alpha(x) = f(x)$ for all $x \in Y$, $g_\alpha(x) \le p(x)$ for all $x \in Y_\alpha$.

Make $\mathcal{K}$ into a partially ordered set by defining the relation $(Y_\alpha, g_\alpha) \le (Y_\beta, g_\beta)$ if
$Y_\alpha \subset Y_\beta$ and $g_\alpha = g_\beta$ on $Y_\alpha$. It is clear that any totally ordered subset $\{(Y_\lambda, g_\lambda)\}$
has an upper bound given by the subspace $\bigcup_\lambda Y_\lambda$ and the functional defined to be
$g_\lambda$ on each $Y_\lambda$.

By Theorem A.4 there is a maximal element $(Y_0, g_0)$ in $\mathcal{K}$. All that remains is
to check that $Y_0$ is all of $X$ (so we may take $F$ to be $g_0$).

Assume that $y_1 \in X \setminus Y_0$. Let $Y_1$ be the linear space spanned by $Y_0$ and $y_1$:
each element $x \in Y_1$ may be expressed uniquely in the form

$x = y + \lambda y_1$, $y \in Y_0$, $\lambda \in \mathbb{R}$,

because $y_1$ is assumed not to be in the linear space $Y_0$. Define a linear functional
$g_1 \in Y_1^*$ by $g_1(y + \lambda y_1) = g_0(y) + \lambda c$.

Now we choose the constant $c$ carefully. Note that if $x, y$ are in $Y_0$, then

$g_0(y) - g_0(x) = g_0(y - x) \le p(y - x) \le p(y + y_1) + p(-y_1 - x),$

so

$-p(-y_1 - x) - g_0(x) \le p(y + y_1) - g_0(y).$

It follows that

$A = \sup_{x \in Y_0} \{-p(-y_1 - x) - g_0(x)\} \le \inf_{y \in Y_0} \{p(y + y_1) - g_0(y)\} = B.$

Choose $c$ to be any number in the interval $[A, B]$. Then by construction of $A$ and $B$,

(38) $c \le p(y + y_1) - g_0(y)$ for all $y \in Y_0$,

(39) $-p(-y_1 - y) - g_0(y) \le c$ for all $y \in Y_0$.

Multiply (38) by $\lambda > 0$ and substitute $\frac{y}{\lambda}$ for $y$ to obtain

(40) $\lambda c \le p(y + \lambda y_1) - g_0(y).$

Now multiply (39) by $\lambda < 0$, substitute $\frac{y}{\lambda}$ for $y$ and use the homogeneity assumption
on $p$ to obtain (40) again. Since (40) is clear for $\lambda = 0$, we deduce that

$g_1(y + \lambda y_1) = g_0(y) + \lambda c \le p(y + \lambda y_1)$

for all $\lambda \in \mathbb{R}$ and $y \in Y_0$. That is, $(Y_1, g_1) \in \mathcal{K}$ and $(Y_0, g_0) \le (Y_1, g_1)$ with $Y_0 \neq Y_1$.
This contradicts the maximality of $(Y_0, g_0)$. □
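The one-step extension in the proof can be made concrete in a small example. The following sketch is an illustration under assumptions that are not in the notes: $X = \mathbb{R}^2$, $p$ is the $\ell_1$ norm, $Y_0 = \{(t, 0)\}$ with $g_0(t, 0) = t$, and $y_1 = (0, 1)$. The supremum $A$ and infimum $B$ are only approximated on a finite grid, so the code is a numerical check and not a proof.

```python
def p(v):
    """Sublinear function on R^2: the l_1 norm (an illustrative choice)."""
    return abs(v[0]) + abs(v[1])

def g0(t):
    """Functional on Y0 = {(t, 0)}: g0((t, 0)) = t, which satisfies g0 <= p."""
    return t

y1 = (0.0, 1.0)  # a vector outside Y0
grid = [k / 10.0 for k in range(-500, 501)]  # sample points t for (t, 0) in Y0

# A = sup_x { -p(-y1 - x) - g0(x) } and B = inf_y { p(y + y1) - g0(y) },
# both approximated over the grid.
A = max(-p((-y1[0] - t, -y1[1])) - g0(t) for t in grid)
B = min(p((t + y1[0], y1[1])) - g0(t) for t in grid)

c = 0.5 * (A + B)  # any c in [A, B] is allowed by the proof

def g1(v):
    """Extension to span(Y0, y1) = R^2: g1((t, 0) + s * y1) = g0(t) + s * c."""
    return g0(v[0]) + v[1] * c

if __name__ == "__main__":
    print("A =", A, "B =", B, "c =", c)
    worst = max(g1((t, s)) - p((t, s)) for t in grid[::50] for s in grid[::50])
    print("max of g1 - p on sample points:", worst)
```

In this example every $c \in [-1, 1]$ gives an admissible extension, which shows that the extension produced by the lemma need not be unique.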

For real linear spaces, the Hahn-Banach theorem follows at once (for complex 
spaces a little more work is needed) . 

Theorem 3.32. Let $X$ be a real normed space, and $Y$ a linear subspace. Then
for any $y^* \in Y^*$ there corresponds an $x^* \in X^*$ such that

$\|x^*\| = \|y^*\|$, and $x^*(y) = y^*(y)$ for all $y \in Y$.

That is, any linear functional defined on a subspace may be extended to a linear
functional on the whole space with the same norm.

PROOF. Let $p(x) = \|y^*\| \|x\|$, $f(x) = y^*(x)$, and $x^* = F$. Apply the Hahn-Banach
Lemma 3.31. To check that $\|x^*\| \le \|y^*\|$, write $x^*(x) = \theta |x^*(x)|$ for $\theta = \pm 1$.
Then

$|x^*(x)| = \theta x^*(x) = x^*(\theta x) \le p(\theta x) = \|y^*\| \|\theta x\| = \|y^*\| \|x\|.$

The reverse inequality is clear, so $\|x^*\| = \|y^*\|$. □

Many useful results follow from the Hahn-Banach theorem. 

Corollary 3.33. Let $Y$ be a linear subspace of a normed linear space $X$, and
let $x_0 \in X$ have the property that

(41) $\inf_{y \in Y} \|y - x_0\| = d > 0.$

Then there exists a point $x^* \in X^*$ such that

$x^*(x_0) = 1$, $\|x^*\| = \frac{1}{d}$, $x^*(y) = 0$ for all $y \in Y$.

Proof. Let $Y_1$ be the linear space spanned by $Y$ and $x_0$. Since $x_0 \notin Y$, every
point $x$ in $Y_1$ may be represented uniquely in the form $x = y + \lambda x_0$, with $y \in Y$,
$\lambda \in \mathbb{R}$. Define a linear functional $z^* \in Y_1^*$ by $z^*(y + \lambda x_0) = \lambda$. If $\lambda \neq 0$, then

$\|y + \lambda x_0\| = |\lambda| \left\| \frac{y}{\lambda} + x_0 \right\| \ge |\lambda| d.$

It follows that $|z^*(x)| \le \|x\| / d$ for all $x \in Y_1$, so $\|z^*\| \le \frac{1}{d}$. Choose a sequence
$(y_n) \subset Y$ with $\|x_0 - y_n\| \to d$ as $n \to \infty$. Then

$1 = z^*(x_0 - y_n) \le \|z^*\| \|x_0 - y_n\| \to \|z^*\| d,$

so $\|z^*\| = \frac{1}{d}$. Apply Theorem 3.32 to $z^*$. □

Corollary 3.34. Let $X$ be a normed linear space. Then, for any $x \neq 0$ in $X$
there is a functional $x^* \in X^*$ with $\|x^*\| = 1$ and $x^*(x) = \|x\|$.

Proof. Apply Corollary 3.33 with $Y = \{0\}$ to find $z^* \in X^*$ such that $\|z^*\| =
1/\|x\|$, $z^*(x) = 1$. We may therefore take $x^*$ to be $\|x\| z^*$. □

Corollary 3.35. If $z \neq y$ in a normed linear space $X$, then there exists
$x^* \in X^*$ such that $x^*(y) \neq x^*(z)$.

PROOF. Apply Corollary 3.34 with $x = y - z$. □

Corollary 3.36. If $X$ is a normed linear space, then

$\|x\| = \sup_{x^* \neq 0} \frac{|x^*(x)|}{\|x^*\|} = \sup_{\|x^*\| = 1} |x^*(x)|.$

Proof. The last two expressions are clearly equal. It is also clear that

$\sup_{\|x^*\| = 1} |x^*(x)| \le \|x\|.$

By Corollary 3.34 there exists $x_0^*$ such that $x_0^*(x) = \|x\|$ and $\|x_0^*\| = 1$, so

$\sup_{\|x^*\| = 1} |x^*(x)| \ge \|x\|.$

□
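In finite dimensions the norming functional of Corollary 3.34 can be written down explicitly. The sketch below is an illustration, not part of the notes: it takes $X = \mathbb{R}^n$ with the $p$-norm for $1 < p < \infty$, identifies a functional with its coefficient vector $y$ (so $x^*(v) = \sum_i y_i v_i$ and $\|x^*\| = \|y\|_q$ with $\frac{1}{p} + \frac{1}{q} = 1$), and uses the hypothetical helper names `p_norm` and `norming_functional`.

```python
import math

def p_norm(v, p):
    """The p-norm of a vector in R^n, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def norming_functional(x, p):
    """Coefficients y of a functional with ||y||_q = 1 and sum(y_i x_i) = ||x||_p,
    assuming 1 < p < infinity and x != 0. Returns (y, q)."""
    q = p / (p - 1.0)
    nx = p_norm(x, p)
    y = [math.copysign(abs(t) ** (p - 1.0), t) / nx ** (p - 1.0) for t in x]
    return y, q

if __name__ == "__main__":
    x = [3.0, -4.0, 1.0]
    y, q = norming_functional(x, 3.0)
    print("dual norm of the functional:", p_norm(y, q))
    print("value at x:", sum(a * b for a, b in zip(y, x)))
    print("norm of x:", p_norm(x, 3.0))
```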
Corollary 3.37. Let $Y$ be a linear subspace of the normed linear space $X$. If
$Y$ is not dense in $X$, then there exists a functional $x^* \neq 0$ such that $x^*(y) = 0$ for
all $y \in Y$.

Proof. Notice that if there is no point $x_0 \in X$ satisfying (41) then $Y$ must be
dense in $X$. So we may choose $x_0$ with (41) and apply Corollary 3.33. □

Notice finally that linear functionals allow us to decompose a linear space: let
$X$ be a normed linear space, and $x^* \in X^*$. The null space or kernel of $x^*$ is the
linear subspace $N_{x^*} = \{x \in X \mid x^*(x) = 0\}$. If $x^* \neq 0$, then there is a point
$x_0 \neq 0$ such that $x^*(x_0) = 1$. Any element $x \in X$ can then be written $x = z + \lambda x_0$,
with $\lambda = x^*(x)$ and $z = x - \lambda x_0 \in N_{x^*}$. Thus, $X = N_{x^*} \oplus Y$, where $Y$ is the
one-dimensional space spanned by $x_0$.
Integration

We have seen in Examples 1.21[3] that the space $C[0, 1]$ of continuous functions
with the $p$-norm

$\|f\|_p = \left( \int_0^1 |f(t)|^p \, dt \right)^{1/p}$

is not complete, even if we extend the space to Riemann-integrable functions.

As discussed in the section on completions, we can think of the completion of
the space in terms of all limit points of (equivalence classes of) Cauchy sequences.
This does not give any real sense of what kind of functions are in the completion. In
this chapter we construct the completions $L_p$ for $1 \le p < \infty$ by describing (without
proofs) the Lebesgue$^1$ integral.

1. Lebesgue measure 

Definition 4.1. Let $\mathcal{B}$ denote the smallest collection of subsets of $\mathbb{R}$ that includes
all the open sets and is closed under countable unions, countable intersections
and complements. These sets are called the Borel sets.

In fact the Borel sets form a $\sigma$-algebra: $\mathbb{R}, \emptyset \in \mathcal{B}$, and $\mathcal{B}$ is closed under
countable unions and intersections. We will call Borel sets measurable. Many
subsets of $\mathbb{R}$ are not measurable, but all the ones you can write down or that might
arise in a practical setting are measurable.

The Lebesgue measure on $\mathbb{R}$ is a map $\mu : \mathcal{B} \to \mathbb{R} \cup \{\infty\}$ with the properties
that

(i) $\mu[a, b] = \mu(a, b) = b - a$;

(ii) $\mu\left( \bigcup_{i=1}^{\infty} A_i \right) = \sum_{i=1}^{\infty} \mu(A_i)$ whenever the sets $A_i \in \mathcal{B}$ are pairwise disjoint.

Notice that the Lebesgue measure attaches a measure to all measurable sets.
Sets of measure zero are called null sets, and something that happens everywhere
except on a set of measure zero is said to happen almost everywhere, often written
simply a.e. For technical reasons, allow any subset of a null set to also be regarded
as "measurable", with measure zero.



1 Henri Léon Lebesgue (1875-1941) was a French mathematician who revolutionized the field
of integration by his generalization of the Riemann integral. Up to the end of the 19th century,
mathematical analysis was limited to continuous functions, based largely on the Riemann method
of integration. Building on the work of others, including that of the French mathematicians Émile
Borel and Camille Jordan, Lebesgue developed (in 1901) his theory of measure. A year later,
Lebesgue extended the usefulness of the definite integral by defining the Lebesgue integral: a
method of extending the concept of area below a curve to include many discontinuous functions.
Lebesgue served on the faculty of several French universities. He made major contributions in
other areas of mathematics, including topology, potential theory, and Fourier analysis.
Exercise 4.2. [1] Prove that $\mu(\mathbb{Q}) = 0$. Thus a.e. real number is irrational.
[2] More can be said: call a real number algebraic if it is a zero of some polynomial
with rational coefficients, and transcendental if not. Then a.e. real number is
transcendental.

[3] Prove that for any measurable sets $A$, $B$, $\mu(A \cup B) = \mu(A) + \mu(B) - \mu(A \cap B)$.
[4] Can you construct a set that is not a member of $\mathcal{B}$?

Definition 4.3. A function $f : \mathbb{R} \to \mathbb{R} \cup \{\pm\infty\}$ is a Lebesgue measurable
function if $f^{-1}(A) \in \mathcal{B}$ for every $A \in \mathcal{B}$.

Example 4.4. [1] The characteristic function $\chi_{\mathbb{Q}}$, defined by $\chi_{\mathbb{Q}}(x) = 1$ if
$x \in \mathbb{Q}$, and $= 0$ if $x \notin \mathbb{Q}$, is an example of a measurable function that is not
Riemann integrable.

[2] All continuous functions are measurable (by Exercise 1.24[1]).

The basic idea in Riemann integration is to approximate functions by step 
functions, whose "integrals" are easy to find. These give the upper and lower 
estimates. In the Lebesgue theory, we do something similar, using simple functions 
instead of step functions. 

A simple function is a map $f : \mathbb{R} \to \mathbb{R}$ of the form

(42) $f(x) = \sum_{i=1}^{n} c_i \chi_{E_i}(x),$

where the $c_i$ are non-zero constants and the $E_i$ are disjoint measurable sets with
$\mu(E_i) < \infty$.

The integral of the simple function (42) is defined to be

$\int_E f \, d\mu = \sum_{i=1}^{n} c_i \mu(E \cap E_i)$

for any measurable set $E$.

The basic approximation fact in the Lebesgue integral is the following: if $f :
\mathbb{R} \to \mathbb{R} \cup \{\pm\infty\}$ is measurable and non-negative, then there is an increasing sequence
$(f_n)$ of simple functions with the property that $f_n(t) \to f(t)$ a.e. We write this as
$f_n \uparrow f$ a.e., and define the integral of $f$ to be

$\int_E f \, d\mu = \lim_{n \to \infty} \int_E f_n \, d\mu.$
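The approximation by simple functions can be carried out explicitly for a concrete function. The sketch below is an illustration only: it assumes $f(x) = x^2$ on $E = [0, 1]$ and uses the dyadic simple functions $f_n = 2^{-n} \lfloor 2^n f \rfloor$, a standard choice that is not spelled out in the notes. Because $f$ is increasing, each level set is an interval whose Lebesgue measure is its length.

```python
import math

def simple_integral(n):
    """Integral over [0, 1] of the simple function f_n = floor(2^n f) / 2^n
    for f(x) = x^2, computed from the measures of its level sets."""
    N = 2 ** n
    total = 0.0
    for k in range(1, N):  # k = 0 contributes nothing, so only non-zero values c_k = k/N
        # E_k = {x in [0, 1] : k/N <= x^2 < (k+1)/N} = [sqrt(k/N), sqrt((k+1)/N))
        measure = math.sqrt((k + 1) / N) - math.sqrt(k / N)
        total += (k / N) * measure
    return total

if __name__ == "__main__":
    # The values increase with n and approach 1/3 from below.
    for n in range(1, 11):
        print(n, simple_integral(n))
```

Since $0 \le f - f_n \le 2^{-n}$ on $[0, 1]$, the $n$th value differs from $\int_0^1 x^2 \, dx = \frac{1}{3}$ by at most $2^{-n}$.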



This "construction" requires the use of the Axiom of Choice and is closely related to the
existence of a Hamel basis for $\mathbb{R}$ as a vector space over $\mathbb{Q}$. The question really has two faces:
1) using the usual axioms of set theory (including the Axiom of Choice), can you exhibit a non-measurable
subset of $\mathbb{R}$? 2) using the usual axioms of set theory without the Axiom of Choice, is
it still possible to exhibit a non-measurable subset of $\mathbb{R}$?

The first question is easily answered. The second question is much deeper because the answer
is "no". This is part of a subject called Model Theory. Solovay showed that there is a model of
set theory (excluding the Axiom of Choice but including a further axiom) in which every subset
of $\mathbb{R}$ is measurable. Shelah tried to remove Solovay's additional axiom, and answered a related
question by exhibiting a model of set theory (excluding the Axiom of Choice but otherwise as
usual) in which every subset of $\mathbb{R}$ has the Baire property. The references are R.M. Solovay, "A
model of set-theory in which every set of reals is Lebesgue measurable", Annals of Math. 92
(1970), 1-56, and S. Shelah, "Can you take Solovay's inaccessible away?", Israel Journal of Math.
48 (1984), 1-47, but both of them require extensive additional background to read.
Notice that (once we allow the value $\infty$), the limit is guaranteed to exist since the
sequence is increasing.

For a general measurable function $f$, write $f = f^+ - f^-$ where both $f^+$ and
$f^-$ are non-negative and measurable, then define

$\int_E f \, d\mu = \int_E f^+ \, d\mu - \int_E f^- \, d\mu.$

Example 4.5. Let $f(x) = \chi_{\mathbb{Q} \cap [0,1]}(x)$. Then $f$ is itself a simple function, so

$\int f \, d\mu = \mu(\mathbb{Q} \cap [0, 1]) = 0.$

A measurable function $f$ on $[a, b]$ is essentially bounded if there is a constant
$K$ such that $|f(x)| \le K$ a.e. on $[a, b]$. The essential supremum of such a function
is the infimum of all such essential bounds $K$, written

$\|f\|_\infty = \mathrm{ess.sup.}_{[a,b]} |f|.$

Definition 4.6. Define $\mathcal{L}_p[a, b]$ to be the linear space of measurable functions
$f$ on $[a, b]$ for which

$\|f\|_p = \left( \int_a^b |f|^p \, d\mu \right)^{1/p} < \infty$

for $p \in [1, \infty)$ and $\mathcal{L}_\infty[a, b]$ to be the linear space of essentially bounded functions.
Notice that $\|\cdot\|_p$ on $\mathcal{L}_p$ is only a semi-norm, since many non-zero functions will for example
have $\|f\|_p = 0$. Define an equivalence relation on $\mathcal{L}_p$ by $f \sim g$ if $\{x \in \mathbb{R} \mid f(x) \neq
g(x)\}$ is a null set. Then define

$L_p[a, b] = \mathcal{L}_p / \sim,$

the space of $L_p$ functions.

In practice we will not think of elements of $L_p$ as equivalence classes of functions,
but as functions defined a.e. A similar definition may be made of $p$-integrable
functions on $\mathbb{R}$, giving the linear space $L_p(\mathbb{R})$.

The following theorems are proved in any book on measure theory or modern
analysis or may be found in any of the references. Theorem 4.7 is sometimes called
the Riesz-Fischer theorem; Theorem 4.8 is Hölder's inequality.

Theorem 4.7. The normed spaces $L_p[a, b]$ and $L_p(\mathbb{R})$ are (separable) Banach
spaces under the norm $\|\cdot\|_p$.

Theorem 4.8. If $\frac{1}{r} = \frac{1}{p} + \frac{1}{q}$, then

$\|fg\|_r \le \|f\|_p \|g\|_q$

for any $f \in L_p[a, b]$, $g \in L_q[a, b]$. It follows that for any measurable $f$ on $[a, b]$,

$\|f\|_1 \le \|f\|_2 \le \|f\|_3 \le \cdots \le \|f\|_\infty$

(when $b - a \le 1$; in general each inequality holds up to a constant factor depending only on $b - a$).
Hence

$L_1[a, b] \supset L_2[a, b] \supset \cdots \supset L_\infty[a, b].$
In the theorem we allow $p$ and $q$ to be anything in $[1, \infty]$ with the obvious
interpretation of $\frac{1}{\infty}$.

Note the "opposite" behaviour to the sequence spaces $\ell_p$ in Example 1.11[3],
where we saw that

$\ell_1 \subset \ell_2 \subset \cdots \subset \ell_\infty.$

Two easy consequences of Hölder's inequality are the Cauchy-Schwarz inequality,

$\|fg\|_1 \le \|f\|_2 \|g\|_2$

and Minkowski's inequality,

$\|f + g\|_p \le \|f\|_p + \|g\|_p.$
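Both inequalities are easy to test numerically on $[0, 1]$. The sketch below is an illustration under stated assumptions: the norms are approximated by a midpoint rule with 2000 points (the discrete averages themselves satisfy Hölder's and Minkowski's inequalities, so the comparisons hold for the approximations too), and the functions $f(t) = e^t$, $g(t) = \cos 5t$ and the exponents $p = 3$, $q = \frac{3}{2}$ are arbitrary choices.

```python
import math

def lp_norm(f, p, samples=2000):
    """Midpoint-rule approximation of the L_p[0, 1] norm of f, for 1 <= p < infinity."""
    h = 1.0 / samples
    return (sum(abs(f((i + 0.5) * h)) ** p for i in range(samples)) * h) ** (1.0 / p)

f = lambda t: math.exp(t)
g = lambda t: math.cos(5.0 * t)
p, q = 3.0, 1.5
r = 1.0 / (1.0 / p + 1.0 / q)  # 1/r = 1/p + 1/q, so r is (up to rounding) 1

holder_lhs = lp_norm(lambda t: f(t) * g(t), r)
holder_rhs = lp_norm(f, p) * lp_norm(g, q)
minkowski_lhs = lp_norm(lambda t: f(t) + g(t), p)
minkowski_rhs = lp_norm(f, p) + lp_norm(g, p)

if __name__ == "__main__":
    print("Hoelder:  ", holder_lhs, "<=", holder_rhs)
    print("Minkowski:", minkowski_lhs, "<=", minkowski_rhs)
```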

The most useful general result about Lebesgue integration is Lebesgue's dominated
convergence theorem.

Theorem 4.9. Let $(f_n)$ be a sequence of measurable functions on a measurable
set $E$ such that $f_n(t) \to f(t)$ a.e. and there exists an integrable function $g$ such
that $|f_n(t)| \le g(t)$ a.e. Then

$\int_E f \, d\mu = \lim_{n \to \infty} \int_E f_n \, d\mu.$

Exercise 4.10. [1] Prove that the $L_p$-norm is strictly convex for $1 < p < \infty$
but is not strictly convex if $p = 1$ or $\infty$.

2. Product spaces and Fubini's theorem

Let $X$ and $Y$ be two subsets of $\mathbb{R}$. Let $\mathcal{A}$, $\mathcal{B}$ denote the $\sigma$-algebra of Borel sets
in $X$ and $Y$ respectively.

Subsets of $X \times Y$ (Cartesian product) of the form

$A \times B = \{(x, y) : x \in A, \ y \in B\}$

with $A \in \mathcal{A}$, $B \in \mathcal{B}$ are called (measurable) rectangles. Let $\mathcal{A} \times \mathcal{B}$ denote the
smallest $\sigma$-algebra on $X \times Y$ containing all the measurable rectangles. Notice that,
despite the notation, this is much larger than the set of all measurable rectangles.
The measure space $(X \times Y, \mathcal{A} \times \mathcal{B})$ is the Cartesian product of $(X, \mathcal{A})$ and $(Y, \mathcal{B})$.
Let $\mu_X$ and $\mu_Y$ denote Lebesgue measure on $X$ and $Y$. Then there is a unique
measure $\lambda$ on $X \times Y$ with the property that

$\lambda(A \times B) = \mu_X(A) \times \mu_Y(B)$

for all measurable rectangles $A \times B$. This measure is called the product measure
of $\mu_X$ and $\mu_Y$ and we write $\lambda = \mu_X \times \mu_Y$.

The most important result on product measures is Fubini's theorem.

Theorem 4.11. If $h$ is an integrable function on $X \times Y$, then $x \mapsto h(x, y)$ is
an integrable function of $x$ for a.e. $y$, $y \mapsto h(x, y)$ is an integrable function of $y$
for a.e. $x$, and

$\int h \, d(\mu_X \times \mu_Y) = \int \int h \, d\mu_X \, d\mu_Y = \int \int h \, d\mu_Y \, d\mu_X.$
Hilbert spaces

We have seen how useful the property of completeness is in our applications
of Banach-space methods to certain differential and integral equations. However,
some obvious ideas for use in differential equations (like Fourier analysis) seem to
go wrong in the obvious Banach space setting (cf. Theorem 3.20). It turns out that
not all Banach spaces are equally good - there are distinguished ones in which the
parallelogram law (equation (43) below) holds, and this has enormous consequences.
It makes more sense in this section to deal with complex linear spaces, so from now
on assume that the ground field is $\mathbb{C}$.

1. Hilbert spaces 

Definition 5.1. A complex linear space $H$ is called a Hilbert space if there
is a complex-valued function $(\cdot, \cdot) : H \times H \to \mathbb{C}$ with the properties

(i) $(x, x) \ge 0$, and $(x, x) = 0$ if and only if $x = 0$;

(ii) $(x + y, z) = (x, z) + (y, z)$ for all $x, y, z \in H$;

(iii) $(\lambda x, y) = \lambda (x, y)$ for all $x, y \in H$ and $\lambda \in \mathbb{C}$;

(iv) $(x, y) = \overline{(y, x)}$ for all $x, y \in H$;

(v) the norm defined by $\|x\| = (x, x)^{1/2}$ makes $H$ into a Banach space.

If only properties (i), (ii), (iii), (iv) hold then $(H, (\cdot, \cdot))$ is called an inner-product
space.

Notice that property (v) makes sense since by (i) $(x, x) \ge 0$, and we shall see
below (Lemma 5.4) that $\|\cdot\|$ is indeed a norm.

The function $(\cdot, \cdot)$ is called an inner or scalar product, and so a Hilbert space
is a complete inner product space.

If the scalar product is real-valued on a real linear space, then the properties
determine a real Hilbert space; all the results below apply to these.

Notice that (iii) and (iv) imply that $(x, \lambda y) = \bar{\lambda} (x, y)$, and $(x, 0) = (0, x) = 0$.

Example 5.2. [1] If $X = \mathbb{C}^n$, then $(x, y) = \sum_{i=1}^{n} x_i \bar{y}_i$ makes $\mathbb{C}^n$ into an
$n$-dimensional Hilbert space.

David Hilbert (1862-1943) was a German mathematician whose work in geometry had the
greatest influence on the field since Euclid. After making a systematic study of the axioms of
Euclidean geometry, Hilbert proposed a set of 21 such axioms and analyzed their significance.
Hilbert received his Ph.D. from the University of Königsberg and served on its faculty from 1886
to 1895. He became (1895) professor of mathematics at the University of Göttingen, where he
remained for the rest of his life. Between 1900 and 1914, many mathematicians from the United
States and elsewhere who later played an important role in the development of mathematics
went to Göttingen to study under him. Hilbert contributed to several branches of mathematics,
including algebraic number theory, functional analysis, mathematical physics, and the calculus of
variations. He also enumerated 23 unsolved problems of mathematics that he considered worthy
of further investigation. Since Hilbert's time, nearly all of these problems have been solved.
[2] Let $X = C[a, b]$ (complex-valued continuous functions). Then the inner-product
$(f, g) = \int_a^b f(t) \overline{g(t)} \, dt$ makes $X$ into an inner-product space that is not a Hilbert
space.

[3] Let $X = \ell_2$ (square-summable sequences; see Example 1.11[3]) with the inner-product
$((x_n), (y_n)) = \sum_{n=1}^{\infty} x_n \bar{y}_n$. This is well-defined by the Schwarz inequality
(Lemma 5.3) and it is a Hilbert space by Example 2.2[3]. We shall see later that $\ell_2$
is the only $\ell_p$ space that is a Hilbert space.

[4] Let $X = L_2[a, b]$ with inner-product $(f, g) = \int_a^b f(t) \overline{g(t)} \, dt$. Then $X$ is a Hilbert
space (by the Cauchy-Schwarz inequality and Theorem 4.7).

Lemma 5.3. In a Hilbert space,

$|(x, y)| \le \|x\| \|y\|.$

Proof. Assume that $x, y$ are non-zero (the result is clear if $x$ or $y$ is zero),
and let $\lambda \in \mathbb{C}$. Then

$0 \le (x + \lambda y, x + \lambda y)$

$= \|x\|^2 + |\lambda|^2 \|y\|^2 + \lambda (y, x) + \bar{\lambda} (x, y)$

$= \|x\|^2 + |\lambda|^2 \|y\|^2 + 2\Re[\lambda (y, x)].$

Let $\lambda = -r e^{i\theta}$ for some $r > 0$, and choose $\theta$ such that $\theta = -\arg(y, x)$ if $(x, y) \neq 0$,
so that $\lambda (y, x) = -r |(x, y)|$. Then

$\|x\|^2 + r^2 \|y\|^2 \ge 2r |(x, y)|.$

Take $r = \|x\| / \|y\|$ to obtain the result. □

Lemma 5.4. The function defined by $\|x\| = (x, x)^{1/2}$ is a norm on a Hilbert
space.

Proof. All the properties are clear except the triangle inequality. Since

$(x, y) + (y, x) = 2\Re(x, y) \le 2 \|x\| \|y\|,$

we have

$\|x + y\|^2 = \|x\|^2 + \|y\|^2 + (x, y) + (y, x)$

$\le \|x\|^2 + \|y\|^2 + 2 \|x\| \|y\| = (\|x\| + \|y\|)^2,$

so $\|x + y\| \le \|x\| + \|y\|$. □

Lemma 5.5. The norm on a Hilbert space is strictly convex (cf. the definition of strict convexity).

Proof. From the proof of Lemma 5.3, if $|(x, y)| = \|x\| \|y\|$, then $x = -\lambda y$.
From the proof of Lemma 5.4 it follows that if $\|x\| + \|y\| = \|x + y\|$ and $y \neq 0$ then
$x = -\lambda y$. Hence if $\|x\| = \|y\| = 1$ and $\|x + y\| = 2$, then $|\lambda| = 1$ and $|1 - \lambda| = 2$,
so $\lambda = -1$ and $x = y$. □

Next there is the peculiar parallelogram law. 

Theorem 5.6. If $H$ is a Hilbert space, then

(43) $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$

for all $x, y \in H$.

Conversely, if $H$ is a complex Banach space with norm $\|\cdot\|$ satisfying (43),
then $H$ is a Hilbert space with scalar product $(\cdot, \cdot)$ satisfying $\|x\| = (x, x)^{1/2}$.
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Proof. The forward direction is easy: simply expand the expression 
(x + y, x + y) + (x - y, x - y). 

For the reverse direction, define 
(44) (x,y) = l([\\x + y\\ 2 -\\x-y\\ 2 ] +i [\\x + ty\\ 2 - \\x - iy\\ 2 }) 

(in the real case, with the second expression simply omitted) . Since 

(x,x) = || a ;|| 2 + ^||.x|| 2 |l + *| 2 -^||x|| 2 |l- J | 2 = || a ;|| 2 , 

the inner-product norm (x, x) 1 / 2 coincides with the norm ||x||. 

To prove that (•,•) satisfies condition (ii) in Definition 15.11 use (14*51) to show 

that 

\\u + v + w\\ 2 + \\u + v - w\\ 2 = 2||ii + u|| 2 + 2j|wj| 2 , 
||u - v + w\\ 2 + \\u - v - w\\ 2 = 2\\u- v\\ 2 + 2\\w\\ 2 . 

It follows that 

(\\u + v - w\\ 2 - \\u - v + w\\ 2 ) + (\\u + v - w\\ 2 - \\u - v - w\\ 2 ) 

= 2||u + w|| 2 -2||m-v|| 2 , 

showing that 

3?(u + w,v)+ di(u -w,v) = 25R(u, v). 
A similar argument shows that 

3(u + w, v) + 9(u — w,v) = 2G(u, v), 

so 

(it + w, v) + (u — w, v) = 2(u, v). 

Taking w = u shows that (2u, v) = 2{u, v). Taking u + w = x, u — w = y 1 v = z 
then gives 

(x, z) + (y, z) = 2 (^-^-, zj =(x + y, z). 

To prove condition (iii) in Definition 5.1, use (ii) to show that

$(mx, y) = ((m - 1)x + x, y) = ((m - 1)x, y) + (x, y)$
$= ((m - 2)x, y) + 2(x, y)$
$= \dots$
$= m(x, y).$

The same argument in reverse shows that $n(x/n, y) = (x, y)$, so $(x/n, y) = (1/n)(x, y)$.
If $r = m/n$ ($m, n \in \mathbb{N}$) then

$r(x, y) = \frac{m}{n}(x, y) = m\left(\frac{x}{n}, y\right) = \left(\frac{m}{n}x, y\right) = (rx, y).$

Now $(x, y)$ is a continuous function in $x$ (by (44)); we deduce that $\lambda(x, y) = (\lambda x, y)$
for all $\lambda > 0$. For $\lambda < 0$,

$\lambda(x, y) - (\lambda x, y) = \lambda(x, y) - (|\lambda|(-x), y) = \lambda(x, y) - |\lambda|(-x, y)$
$= \lambda(x, y) + \lambda(-x, y) = \lambda(0, y) = 0,$

so (iii) holds for all $\lambda \in \mathbb{R}$. For $\lambda = i$, (iii) is clear, so if $\lambda = \mu + i\nu$,

$\lambda(x, y) = \mu(x, y) + i\nu(x, y) = (\mu x, y) + i(\nu x, y)$

$= (\mu x, y) + (i\nu x, y) = (\lambda x, y).$

Condition (iv) is clear, and (v) follows from the assumption that $H$ is a Banach
space. □
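
As a small numerical illustration (not part of the original notes), the following Python sketch checks the parallelogram law (43) and the polarization formula (44) in $\mathbb{C}^2$ with its usual inner product. The helper names `inner`, `norm_sq`, `add` and `polarization`, and the two sample vectors, are assumptions made for this example only.

```python
# Inner product on C^n: linear in the first slot, conjugate-linear in the second.
def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    return inner(x, x).real

def add(x, y, scale=1):
    # Componentwise x + scale * y.
    return [a + scale * b for a, b in zip(x, y)]

def polarization(x, y):
    # Right-hand side of (44): the inner product rebuilt from norms only.
    real_part = norm_sq(add(x, y)) - norm_sq(add(x, y, -1))
    imag_part = norm_sq(add(x, y, 1j)) - norm_sq(add(x, y, -1j))
    return (real_part + 1j * imag_part) / 4

# Sample vectors (hypothetical, chosen with small integer entries).
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

parallelogram_lhs = norm_sq(add(x, y)) + norm_sq(add(x, y, -1))
parallelogram_rhs = 2 * norm_sq(x) + 2 * norm_sq(y)
```

For these vectors both sides of (43) agree, and `polarization(x, y)` reproduces `inner(x, y)`, which is the content of the reverse direction of the proof in this concrete case.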

2. Projection theorem 

Let $H$ be a Hilbert space. A point $x \in H$ is orthogonal to a point $y \in H$,
written $x \perp y$, if $(x, y) = 0$. For sets $N, M$ in $H$, $x$ is orthogonal to $N$, written
$x \perp N$, if $(x, y) = 0$ for all $y \in N$. The sets $N$ and $M$ are orthogonal (written
$N \perp M$) if $x \perp M$ for all $x \in N$. The orthogonal complement of $M$ is defined as

$M^{\perp} = \{x \in H \mid x \perp M\}.$

Notice that for any $M$, $M^{\perp}$ is a closed linear subspace of $H$.

Lemma 5.7. Let $M$ be a closed convex set in a Hilbert space $H$. For every point
$x_0 \in H$ there is a unique point $y_0 \in M$ such that

(45) $\|x_0 - y_0\| = \inf_{y \in M} \|x_0 - y\|.$

That is, it makes sense in a Hilbert space to talk about the point in a closed 
convex set that is "closest" to a given point. 

Proof. Let $d = \inf_{y \in M} \|x_0 - y\|$ and choose a sequence $(y_n)$ in $M$ such that
$\|x_0 - y_n\| \to d$ as $n \to \infty$. By the parallelogram law (43),

$4\|x_0 - \frac{1}{2}(y_m + y_n)\|^2 + \|y_m - y_n\|^2 = 2\|x_0 - y_m\|^2 + 2\|x_0 - y_n\|^2 \to 4d^2$

as $m, n \to \infty$. By convexity (Definition 1.9), $\frac{1}{2}(y_m + y_n) \in M$, so

$4\|x_0 - \frac{1}{2}(y_m + y_n)\|^2 \ge 4d^2.$

It follows that $\|y_m - y_n\| \to 0$ as $m, n \to \infty$. Now $H$ is complete and $M$ is a closed
subset, so $\lim_{n \to \infty} y_n = y_0$ exists and lies in $M$. Now
$\|x_0 - y_0\| = \lim_{n \to \infty} \|x_0 - y_n\| = d$, showing (45).

It remains to check that the point $y_0$ is the only point with property (45). Let
$y_1$ be another point in $M$ with

$\|x_0 - y_1\| = \inf_{y \in M} \|x_0 - y\|.$

Then

$2\left\|x_0 - \frac{y_0 + y_1}{2}\right\| \le \|x_0 - y_0\| + \|x_0 - y_1\| \le 2\inf_{y \in M} \|x_0 - y\| \le 2\left\|x_0 - \frac{y_0 + y_1}{2}\right\|$

since $(y_0 + y_1)/2$ lies in $M$. It follows that

$2\left\|x_0 - \frac{y_0 + y_1}{2}\right\| = \|x_0 - y_0\| + \|x_0 - y_1\|.$

Since the Hilbert norm is strictly convex (Lemma 5.5), we deduce that
$x_0 - y_0 = x_0 - y_1$, so $y_1 = y_0$. □

This gives us the Orthogonal Projection Theorem.

Theorem 5.8. Let $M$ be a closed linear subspace of a Hilbert space $H$. Then
any $x_0 \in H$ can be written $x_0 = y_0 + z_0$, $y_0 \in M$, $z_0 \in M^{\perp}$. The elements $y_0$, $z_0$
are determined uniquely by $x_0$.

Proof. If $x_0 \in M$ then $y_0 = x_0$ and $z_0 = 0$. If $x_0 \notin M$, then let $y_0$ be the
point in $M$ with

$\|x_0 - y_0\| = \inf_{y \in M} \|x_0 - y\|$
(this point exists by Lemma 5.7). Now for any $y \in M$ and $\lambda \in \mathbb{C}$, $y_0 + \lambda y \in M$ so

$\|x_0 - y_0\|^2 \le \|x_0 - y_0 - \lambda y\|^2 = \|x_0 - y_0\|^2 - 2\Re\lambda(y, x_0 - y_0) + |\lambda|^2\|y\|^2.$
Hence

$-2\Re\lambda(y, x_0 - y_0) + |\lambda|^2\|y\|^2 \ge 0.$
Assume now that $\lambda = \epsilon > 0$ and divide by $\epsilon$. As $\epsilon \to 0$ we deduce that

(46) $\Re(y, x_0 - y_0) \le 0.$
Assume next that $\lambda = -i\epsilon$ and divide by $\epsilon$. As $\epsilon \to 0$, we get

(47) $\Im(y, x_0 - y_0) \le 0.$

Exactly the same argument may be applied to $-y$ since $-y \in M$, showing that
(46) and (47) hold with $y$ replaced by $-y$. Thus $(y, x_0 - y_0) = 0$ for all $y \in M$. It
follows that the point $z_0 = x_0 - y_0$ lies in $M^{\perp}$.

Finally, we check that the decomposition is unique. Suppose that $x_0 = y_1 + z_1$
with $y_1 \in M$ and $z_1 \in M^{\perp}$. Then $y_0 - y_1 = z_1 - z_0 \in M \cap M^{\perp} = \{0\}$. □
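
To make the decomposition $x_0 = y_0 + z_0$ concrete, here is a minimal Python sketch in $\mathbb{R}^3$ with $M$ the plane spanned by two vectors. It is an illustration only: the function `project_onto_span`, the vectors `a`, `b` and the point `x0` are hypothetical choices, and the projection is found by solving the $2 \times 2$ normal equations exactly with rational arithmetic.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_onto_span(x, a, b):
    """Orthogonal projection of x onto span{a, b} (a, b linearly independent)."""
    # Normal equations: x - c1*a - c2*b must be orthogonal to both a and b.
    g11, g12, g22 = dot(a, a), dot(a, b), dot(b, b)
    r1, r2 = dot(x, a), dot(x, b)
    det = g11 * g22 - g12 * g12
    c1 = (g22 * r1 - g12 * r2) / det
    c2 = (g11 * r2 - g12 * r1) / det
    return [c1 * ai + c2 * bi for ai, bi in zip(a, b)]

a = [Fraction(1), Fraction(1), Fraction(0)]
b = [Fraction(0), Fraction(1), Fraction(1)]
x0 = [Fraction(1), Fraction(2), Fraction(3)]

y0 = project_onto_span(x0, a, b)           # component in M
z0 = [xi - yi for xi, yi in zip(x0, y0)]   # component in the orthogonal complement
```

Here `z0` is orthogonal to both spanning vectors, so it lies in $M^{\perp}$, as the theorem predicts.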

Corollary 5.9. If $M$ is a closed linear subspace and $M \ne H$, then there exists
an element $z_0 \ne 0$ such that $z_0 \perp M$.

Proof. Apply the projection theorem (Theorem 5.8) to any $x_0 \in H \setminus M$. □

It follows that all linear functionals on a Hilbert space are given by taking inner 
products - the Riesz theorem. 

Theorem 5.10. For every bounded linear functional $x^*$ on a Hilbert space $H$
there exists a unique element $z \in H$ such that $x^*(x) = (x, z)$ for all $x \in H$. The
norm of the functional is given by $\|x^*\| = \|z\|$.

Proof. Let $N$ be the null space of $x^*$; $N$ is a closed linear subspace of $H$. If
$N = H$, then $x^* = 0$ and we may take $x^*(x) = (x, 0)$. If $N \ne H$, then by Corollary
5.9 there is a point $z_0 \in N^{\perp}$, $z_0 \ne 0$. By construction, $\alpha = x^*(z_0) \ne 0$. For any
$x \in H$, the point $x - x^*(x)z_0/\alpha$ lies in $N$, so

$(x - x^*(x)z_0/\alpha, z_0) = 0.$

It follows that

$x^*(x)\left(\frac{z_0}{\alpha}, z_0\right) = (x, z_0).$

If we substitute $z = \frac{\bar{\alpha}}{(z_0, z_0)} z_0$, we get $x^*(x) = (x, z)$ for all $x \in H$.

To check uniqueness, assume that $x^*(x) = (x, z')$ for all $x \in H$. Then
$(x, z - z') = 0$ for all $x \in H$, so (taking $x = z - z'$) $\|z - z'\| = 0$ and therefore $z = z'$.

Finally,

$\|x^*\| = \sup_{\|x\| = 1} |x^*(x)| = \sup_{\|x\| = 1} |(x, z)| \le \sup_{\|x\| = 1} (\|x\|\|z\|) = \|z\|.$

On the other hand,

$\|z\|^2 = (z, z) = |x^*(z)| \le \|x^*\|\|z\|,$
so $\|z\| \le \|x^*\|$. □
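
The construction in the proof can be traced in $\mathbb{C}^3$. The sketch below is illustrative only: the functional (given by the hypothetical coefficient list `coeffs`) and the choice of `z0` are assumptions for the example. It follows the proof's recipe $z = \bar{\alpha}(z_0, z_0)^{-1} z_0$ with $\alpha = x^*(z_0)$.

```python
def inner(x, y):
    # (x, y) = sum of x_k * conj(y_k): linear in x, conjugate-linear in y.
    return sum(a * b.conjugate() for a, b in zip(x, y))

# Hypothetical bounded functional x*(x) = sum of coeffs[k] * x[k].
coeffs = [2 + 1j, -1j, 3 + 0j]

def functional(x):
    return sum(c * a for c, a in zip(coeffs, x))

# Any non-zero vector orthogonal to the null space can serve as z0;
# a scaled copy of the conjugated coefficients is one such vector.
z0 = [5 * c.conjugate() for c in coeffs]
alpha = functional(z0)                      # alpha = x*(z0), non-zero
scale = alpha.conjugate() / inner(z0, z0)
z = [scale * a for a in z0]                 # z = conj(alpha) / (z0, z0) * z0

x = [1 - 2j, 4 + 0j, 0.5j]                  # arbitrary test vector
```

Whatever non-zero scaling is used for `z0`, the resulting `z` is the same vector (here the conjugated coefficient list), reflecting the uniqueness part of the theorem.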

Corollary 5.11. If $H$ is a Hilbert space, then the space $H^*$ is also a Hilbert
space. The map $\sigma : H \to H^*$ given by $(\sigma x)(y) = (y, x)$ is an isometric embedding
of $H$ onto $H^*$.

Definition 5.12. Let $M$ and $N$ be linear subspaces of a Hilbert space $H$. If
every element in the linear space $M + N$ has a unique representation in the form
$x + y$, $x \in M$, $y \in N$, then we say $M + N$ is a direct sum. If $M \perp N$, then we write
$M \oplus N$, and this sum is automatically a direct one. If $Y = M \oplus N$, then we also
write $N = Y \ominus M$ and call $N$ the orthogonal complement of $M$ in $Y$.

Notice that the projection theorem says that if $M$ is a closed linear space in
$H$, then $H = M \oplus M^{\perp}$.

3. Projection and self adjoint operators 

Definition 5.13. Let $M$ be a closed linear subspace of the Hilbert space $H$. By
the projection theorem, every $x \in H$ can be written uniquely in the form $x = y + z$
with $y \in M$, $z \in M^{\perp}$. Call $y$ the projection of $x$ in $M$, and the operator $P = P_M$
defined by $Px = y$ is the projection on $M$. The space $M$ is called the subspace of
the projection $P$.

Definition 5.14. Let $T : H \to H$ be a bounded linear operator. The adjoint
$T^*$ of $T$ is defined by the relation $(Tx, y) = (x, T^*y)$ for all $x, y \in H$. An operator
with $T = T^*$ is called self-adjoint.

Notice that if $T$ is self-adjoint, then for every $x \in H$, $(Tx, x) \in \mathbb{R}$.

Exercise 5.15. Let $T$ and $S$ be bounded linear operators in Hilbert space $H$,
and $\lambda \in \mathbb{C}$. Prove the following: $(T + S)^* = T^* + S^*$; $(TS)^* = S^*T^*$; $(\lambda T)^* = \bar{\lambda}T^*$;
$I^* = I$; $T^{**} = T$; $\|T^*\| = \|T\|$. If $T^{-1}$ is also a bounded linear operator with domain
$H$, then $(T^*)^{-1}$ is a bounded linear map with domain $H$ and $(T^{-1})^* = (T^*)^{-1}$.

Theorem 5.16. [1] If $P$ is a projection, then $P$ is self-adjoint, $P^2 = P$ and
$\|P\| = 1$ if $P \ne 0$.

[2] If $P$ is a self-adjoint operator with $P^2 = P$, then $P$ is a projection.

Proof. [1] Let $P = P_M$, and $x_i = y_i + z_i$ for $i = 1, 2$ where $y_i \in M$ and $z_i \in M^{\perp}$. Then
$\lambda_1 x_1 + \lambda_2 x_2 = (\lambda_1 y_1 + \lambda_2 y_2) + (\lambda_1 z_1 + \lambda_2 z_2)$ and

$(\lambda_1 y_1 + \lambda_2 y_2) \in M, \quad (\lambda_1 z_1 + \lambda_2 z_2) \in M^{\perp}.$

It follows that $P$ is linear. To see that $P^2 = P$, notice that $P^2 x_1 = P(Px_1) =
P(y_1) = y_1 = Px_1$ since $y_1 \in M$. Notice that $\|x_1\|^2 = \|y_1\|^2 + \|z_1\|^2 \ge \|y_1\|^2 =
\|Px_1\|^2$ so $\|P\| \le 1$. If $P \ne 0$ then for any $x \in M \setminus \{0\}$ we have $Px = x$ so $\|P\| \ge 1$.
Self-adjointness is clear:

$(Px_1, x_2) = (y_1, x_2) = (y_1, y_2) = (x_1, y_2) = (x_1, Px_2).$

[2] Let $M = P(H)$; then $M$ is a linear subspace of $H$. If $y_n = P(x_n)$, with $y_n \to z$,
then $Py_n = P^2 x_n = Px_n = y_n$, so $z = \lim_n y_n = \lim_n Py_n = Pz \in M$ so $M$ is
closed. Since $P$ is self-adjoint and $P^2 = P$,

$(x - Px, Py) = (Px - P^2 x, y) = 0$
for all $y \in H$ so $x - Px \in M^{\perp}$. This means that $x = Px + (x - Px)$ is the unique
decomposition of $x$ as a sum $y + z$ with $y \in M$ and $z \in M^{\perp}$. That is, $P$ is the
projection $P_M$. □

We collect all the elementary properties of projections into the next theorem.
Projections $P_1$ and $P_2$ are orthogonal if $P_1 P_2 = 0$. Since projections are
self-adjoint, $P_1 P_2 = 0$ if and only if $P_2 P_1 = 0$.

The projection $P_L$ is part of the projection $P_M$ if and only if $L \subset M$.

Theorem 5.17. [1] Projections $P_M$ and $P_N$ are orthogonal if and only if
$M \perp N$.

[2] The sum of two projections $P_M$ and $P_N$ is a projection if and only if $P_M P_N = 0$.
In that case, $P_M + P_N = P_{M \oplus N}$.

[3] The product of two projections $P_M$ and $P_N$ is another projection if and only if
$P_M P_N = P_N P_M$. In that case, $P_M P_N = P_{M \cap N}$.

[4] $P_L$ is part of $P_M$ $\iff$ $P_M P_L = P_L$ $\iff$ $P_L P_M = P_L$ $\iff$ $\|P_L x\| \le \|P_M x\|$ for all $x \in H$.

[5] If $P$ is a projection, then $I - P$ is a projection.

[6] More generally, $P = P_M - P_L$ is a projection if and only if $P_L$ is a part of $P_M$.
If so, then $P = P_{M \ominus L}$.

Proof. [1] Let $P_M P_N = 0$ and $x \in M$, $y \in N$. Then

$(x, y) = (P_M x, P_N y) = (P_N P_M x, y) = 0,$

so $M \perp N$. Conversely, if $M \perp N$ then for any $x \in H$, $P_N x \perp M$ so $P_M(P_N x) = 0$.
[2] If $P = P_M + P_N$ is a projection, then $P^2 = P$, so $P_M P_N + P_N P_M = 0$. Hence

$P_M P_N + P_M P_N P_M = 0,$

after multiplying by $P_M$ on the left. Multiplying on the right by $P_M$ then gives
$2P_M P_N P_M = 0$ so $P_M P_N = 0$.

Conversely, if $P_M P_N = 0$ then $P_N P_M = 0$ also, so $P^2 = P$. Since $P$ is
self-adjoint, it is a projection.

Finally, it is clear that $(P_M + P_N)(H) = M \oplus N$ so $P = P_{M \oplus N}$.
[3] If $P = P_M P_N$ is a projection, then $P^* = P$, so $P_M P_N = (P_M P_N)^* = P_N^* P_M^* =
P_N P_M$.

Conversely, let $P_M P_N = P_N P_M = P$. Then $P^* = P$, so $P$ is self-adjoint.
Also $P^2 = P_M P_N P_M P_N = P_M^2 P_N^2 = P_M P_N = P$, so $P$ is a projection. Moreover,
$Px = P_M(P_N x) = P_N(P_M x)$ so $Px \in M \cap N$. On the other hand, if $x \in M \cap N$
then $Px = P_M(P_N x) = P_M x = x$ so $P = P_{M \cap N}$.

[4] Assume that $P_L$ is part of $P_M$, so $L \subset M$. Then $P_L x \in M$ for all $x \in H$. Hence
$P_M P_L x = P_L x$, and $P_M P_L = P_L$.
If $P_M P_L = P_L$, then

$P_L = P_L^* = (P_M P_L)^* = P_L^* P_M^* = P_L P_M,$

so $P_L P_M = P_L$.
If $P_L P_M = P_L$, then for any $x \in H$,

$\|P_L x\| = \|P_L P_M x\| \le \|P_L\|\|P_M x\| \le \|P_M x\|$

so that $\|P_L x\| \le \|P_M x\|$.






Finally, assume that $\|P_L x\| \le \|P_M x\|$ for all $x \in H$. If there is a point $x_0 \in L \setminus M$ then let
$x_0 = y_0 + z_0$, $y_0 \in M$, $z_0 \perp M$, and $z_0 \ne 0$. Then

$\|P_L x_0\|^2 = \|y_0\|^2 + \|z_0\|^2 > \|y_0\|^2 = \|P_M x_0\|^2,$

so there can be no such point. It follows that $L \subset M$ so $P_L$ is a part of $P_M$.
[5] $I - P$ is self-adjoint, and $(I - P)^2 = I - P - P + P^2 = I - P$.
[6] If $P$ is a projection, then by [5] so is $I - P = (I - P_M) + P_L$. Also by [5], $I - P_M$
is a projection, so by [2] we must have $(I - P_M)P_L = 0$. That is, $P_L = P_M P_L$.
Hence, by [4], $P_L$ is a part of $P_M$.

Conversely, if $P_L$ is part of $P_M$, then $P_M - P_L$ and $P_L$ are orthogonal. By [2],
the subspace $Y$ of $P_M - P_L$ must therefore satisfy $Y \oplus L = M$, so $Y = M \ominus L$. □
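
Several parts of Theorem 5.17 can be checked by hand on $3 \times 3$ matrices. The following Python sketch is an illustration under stated assumptions: the vectors `a` and `b_perp` are hypothetical, the helpers `rank_one_projection`, `matmul` and `matadd` are written for this example, and exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction

def rank_one_projection(v):
    """Matrix of the orthogonal projection onto span{v} in R^n."""
    vv = sum(c * c for c in v)
    return [[vi * vj / vv for vj in v] for vi in v]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B, sign=1):
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

a = [Fraction(1), Fraction(1), Fraction(0)]
b_perp = [Fraction(-1), Fraction(1), Fraction(2)]   # orthogonal to a

P_L = rank_one_projection(a)        # L = span{a}
P_N = rank_one_projection(b_perp)   # N = span{b_perp}, orthogonal to L
P_M = matadd(P_L, P_N)              # sum of orthogonal projections, as in part [2]
zero = [[Fraction(0)] * 3 for _ in range(3)]
```

With these matrices, `P_L` and `P_N` multiply to zero (part [1]), `P_M` is idempotent (part [2]), `P_M P_L = P_L P_M = P_L` (part [4]), and `P_M - P_L` equals `P_N`, the projection onto $M \ominus L$ (part [6]).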



4. Orthonormal sets 

A subset $K$ in a Hilbert space $H$ is orthonormal if each element of $K$ has norm
1, and if any two elements of $K$ are orthogonal. An orthonormal set $K$ is complete
if $K^{\perp} = \{0\}$.

Theorem 5.18. Let $\{x_n\}$ be an orthonormal sequence in $H$. Then for any
$x \in H$,

(48) $\sum_{n=1}^{\infty} |(x, x_n)|^2 \le \|x\|^2.$

The inequality (48) is Bessel's inequality. The scalar coefficients $(x, x_n)$ are
called the Fourier coefficients of $x$ with respect to $\{x_n\}$.



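Proof. We have

$\left\|x - \sum_{n=1}^{m} (x, x_n)x_n\right\|^2 = \left(x - \sum_{n=1}^{m} (x, x_n)x_n,\; x - \sum_{n=1}^{m} (x, x_n)x_n\right)$

$= \|x\|^2 - \left(x, \sum_{n=1}^{m} (x, x_n)x_n\right) - \left(\sum_{n=1}^{m} (x, x_n)x_n, x\right) + \sum_{n=1}^{m} (x, x_n)(x_n, x)$

(49) $= \|x\|^2 - \sum_{n=1}^{m} |(x, x_n)|^2.$

It follows that

$\sum_{n=1}^{m} |(x, x_n)|^2 \le \|x\|^2,$

and Bessel's inequality follows by taking $m \to \infty$.

□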
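
A short exact computation illustrates both Bessel's inequality and the identity behind it. The sketch below is not from the notes: the orthonormal pair `e1`, `e2` in $\mathbb{R}^3$ and the vector `x` are hypothetical choices with rational entries, so that every quantity can be computed without rounding.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Two orthonormal vectors with rational entries, and a vector to expand.
e1 = [F(3, 5), F(4, 5), F(0)]
e2 = [F(-4, 5), F(3, 5), F(0)]
x = [F(1), F(2), F(3)]

coeffs = [dot(x, e) for e in (e1, e2)]          # Fourier coefficients (x, x_n)
partial = [sum(c * e[k] for c, e in zip(coeffs, (e1, e2))) for k in range(3)]
residual = [xi - pi for xi, pi in zip(x, partial)]

bessel_sum = sum(c * c for c in coeffs)         # sum of |(x, x_n)|^2
```

Here `bessel_sum` is 5 while $\|x\|^2 = 14$, and the squared norm of `residual` is exactly the difference 9, in agreement with the identity used in the proof.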



The next result shows that the Fourier series of Theorem 5.18 is the best
possible approximation of fixed length.

Theorem 5.19. Let $\{x_n\}$ be an orthonormal sequence in a Hilbert space $H$ and
let $\{\lambda_n\}$ be any sequence of scalars. Then, for any $m \ge 1$,

$\left\|x - \sum_{n=1}^{m} \lambda_n x_n\right\| \ge \left\|x - \sum_{n=1}^{m} (x, x_n)x_n\right\|.$

Proof. Write $c_n = (x, x_n)$. Then

$\left\|x - \sum_{n=1}^{m} \lambda_n x_n\right\|^2 = \|x\|^2 - \sum_{n=1}^{m} \bar{\lambda}_n c_n - \sum_{n=1}^{m} \lambda_n \bar{c}_n + \sum_{n=1}^{m} |\lambda_n|^2$

$= \|x\|^2 - \sum_{n=1}^{m} |c_n|^2 + \sum_{n=1}^{m} |c_n - \lambda_n|^2$

$\ge \|x\|^2 - \sum_{n=1}^{m} |c_n|^2.$

Now apply equation (49).

□
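
The best-approximation property can be probed by brute force on a small example. This is only a sketch under assumptions: the orthonormal pair, the vector `x` and the search grid are hypothetical, and a finite grid can illustrate the inequality but of course not prove it.

```python
from fractions import Fraction as F
from itertools import product

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

e1 = [F(3, 5), F(4, 5), F(0)]
e2 = [F(-4, 5), F(3, 5), F(0)]
x = [F(1), F(2), F(3)]

def error_sq(l1, l2):
    """Squared distance from x to l1*e1 + l2*e2."""
    diff = [xi - l1 * a - l2 * b for xi, a, b in zip(x, e1, e2)]
    return dot(diff, diff)

c1, c2 = dot(x, e1), dot(x, e2)        # Fourier coefficients of x
best = error_sq(c1, c2)

# Compare against every pair of coefficients on a rational grid.
grid = [F(k, 5) for k in range(-15, 16)]
smallest_gap = min(error_sq(l1, l2) - best for l1, l2 in product(grid, grid))
```

No grid point beats the Fourier coefficients, and the gap for any pair is exactly $|c_1 - \lambda_1|^2 + |c_2 - \lambda_2|^2$, which is the middle line of the proof.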



Theorem 5.20. Let $\{x_n\}$ be an orthonormal sequence in a Hilbert space $H$,
and let $\{\alpha_n\}$ be any sequence of scalars. Then the series $\sum \alpha_n x_n$ is convergent if
and only if $\sum |\alpha_n|^2 < \infty$, and if so

(50) $\left\|\sum_{n=1}^{\infty} \alpha_n x_n\right\|^2 = \sum_{n=1}^{\infty} |\alpha_n|^2.$

Moreover, the sum $\sum \alpha_n x_n$ is independent of the order in which the terms are
arranged.

Proof. For $m > n$ we have (by orthonormality)

(51) $\left\|\sum_{j=n}^{m} \alpha_j x_j\right\|^2 = \sum_{j=n}^{m} |\alpha_j|^2.$

Since $H$ is complete, (51) shows the first part of the theorem. Take $n = 1$ and
$m \to \infty$ in (51) to get (50).

Assume that $\sum |\alpha_j|^2 < \infty$ and let $z = \sum \alpha_{j_n} x_{j_n}$ be a rearrangement of the
series $x = \sum \alpha_j x_j$. Then

(52) $\|x - z\|^2 = (x, x) + (z, z) - (x, z) - (z, x),$

and $(x, x) = (z, z) = \sum |\alpha_j|^2$. Write

$s_m = \sum_{j=1}^{m} \alpha_j x_j, \quad t_m = \sum_{n=1}^{m} \alpha_{j_n} x_{j_n}.$

Then

$(x, z) = \lim_{m \to \infty} (s_m, t_m) = \sum_{j=1}^{\infty} |\alpha_j|^2.$

Also, $(z, x) = \overline{(x, z)} = (x, z)$ so (52) shows that $\|x - z\|^2 = 0$ and hence $x = z$. □

Theorem 5.21. Let $K$ be any orthonormal set in a Hilbert space $H$, and for
each $x \in H$ let $K_x = \{y \mid y \in K, (x, y) \ne 0\}$. Then:

(i) for any $x \in H$, $K_x$ is countable;

(ii) the sum $Ex = \sum_{y \in K_x} (x, y)y$ converges independently of the order in which
the terms are arranged;

(iii) $E$ is the projection operator onto the closed linear space spanned by $K$.

Proof. From Bessel's inequality (48), for any $\epsilon > 0$ there are no more than
$\|x\|^2/\epsilon^2$ points $y$ in $K$ with $|(x, y)| > \epsilon$. Taking $\epsilon = 1/n$ for $n = 1, 2, \dots$ we see that $K_x$ is
countable for any $x$.

Bessel's inequality and Theorem 5.20 show (ii).

Let $\langle K \rangle$ denote the closed linear subspace spanned by $K$. If $x \perp \langle K \rangle$
then $Ex = 0$. If $x \in \langle K \rangle$ then for any $\epsilon > 0$ there are scalars $\lambda_1, \dots, \lambda_n$ and
elements $y_1, \dots, y_n \in K$ such that

$\left\|x - \sum_{j=1}^{n} \lambda_j y_j\right\| < \epsilon.$

Then, by Theorem 5.19,

(53) $\left\|x - \sum_{j=1}^{n} (x, y_j)y_j\right\| < \epsilon.$

Without loss of generality, all of the $y_j$ lie in $K_x$. Arrange the set $K_x$ in a sequence
$\{y_j\}$. From (49) notice that the left-hand side of (53) does not increase with $n$.
Taking $n \to \infty$, we get $\|x - Ex\| \le \epsilon$. Since $\epsilon > 0$ is arbitrary, we deduce that
$Ex = x$ for all $x \in \langle K \rangle$. This proves that $E = P_{\langle K \rangle}$.

□



Definition 5.22. A set $K$ is an orthonormal basis of $H$ if $K$ is orthonormal
and for every $x \in H$,

(54) $x = \sum_{y \in K_x} (x, y)y.$

Theorem 5.23. Let $K$ be an orthonormal set in a Hilbert space $H$. Then the
following properties are equivalent.

(i) $K$ is complete;

(ii) $\langle K \rangle = H$;

(iii) $K$ is an orthonormal basis for $H$;

(iv) for any $x \in H$, $\|x\|^2 = \sum_{y \in K_x} |(x, y)|^2$.

The equality in (iv) is called Parseval's formula.

Proof. That (i) implies (ii) follows from Corollary 5.9. Assume (ii). Then
by Theorem 5.21, $Ex = x$ for all $x \in H$, so $K$ is an orthonormal basis. Now
assume (iii). Arrange the elements of $K_x$ in a sequence $\{x_n\}$, and let the number
of terms tend to infinity in (49) to obtain Parseval's formula (iv). Finally, assume (iv). If $x \perp K$, then
$\|x\|^2 = \sum_{y \in K_x} |(x, y)|^2 = 0$, so $x = 0$. This means that (iv) implies (i). □

Theorem 5.24. Every Hilbert space has an orthonormal basis. Any orthonormal
basis in a separable Hilbert space is countable.

Example 5.25. Classical Fourier analysis comes about using the orthonormal
basis $\{e^{2\pi i n t}\}_{n \in \mathbb{Z}}$ for $L_2[0, 1]$.

Proof of Theorem 5.24. Let $H$ be a Hilbert space, and consider the classes of orthonormal
sets in $H$ with the partial order of inclusion. By Lemma A.4 there exists a maximal
orthonormal set $K$. Since $K$ is maximal, it is complete and is therefore an
orthonormal basis.

Now let $H$ be separable, and suppose that $\{x_\alpha\}$ is an uncountable orthonormal
basis. Since, for any $\alpha \ne \beta$,

$\|x_\alpha - x_\beta\|^2 = \|x_\alpha\|^2 + \|x_\beta\|^2 = 2,$

the balls $B_{1/2}(x_\alpha)$ are mutually disjoint. If $\{y_n\}$ is a dense sequence in $H$, then
there is a ball $B_{1/2}(x_{\alpha_0})$ that does not contain any of the points $y_n$. Hence $x_{\alpha_0}$ is
not in the closure of $\{y_n\}$, a contradiction. □

Corollary 5.26. Any two infinite-dimensional separable Hilbert spaces are
isometrically isomorphic.

Proof. Let $H_1$ and $H_2$ be two such spaces. By Theorem 5.24 there are
sequences $\{x_n\}$ and $\{y_n\}$ that form orthonormal bases for $H_1$ and $H_2$ respectively.
Given any points $x \in H_1$ and $y \in H_2$, we may write

(55) $x = \sum_{n=1}^{\infty} c_n x_n, \quad y = \sum_{n=1}^{\infty} d_n y_n,$

where $c_n = (x, x_n)$ and $d_n = (y, y_n)$ for all $n \ge 1$. Define a map $T : H_1 \to H_2$ by
$Tx = y$ if $c_n = d_n$ for all $n$ in (55). It is clear that $T$ is linear and it maps $H_1$ onto
$H_2$ since the sequences $(c_n)$ and $(d_n)$ run through all of $\ell_2$. Also,

$\|Tx\|^2 = \sum_{n=1}^{\infty} |d_n|^2 = \sum_{n=1}^{\infty} |c_n|^2 = \|x\|^2,$

so $T$ is an isometry. □

5. Gram Schmidt orthonormalization 

Starting with any linearly independent set $\{x_1, x_2, \dots\}$ in a Hilbert space
$H$, we can inductively construct an orthonormal set that spans the same subspace
by the Gram-Schmidt Orthonormalization process (Theorem 5.27). The idea is
simple: first, any non-zero vector $v$ can be reduced to unit length simply by dividing by
the length $\|v\|$. Second, if $x_1$ is a fixed unit vector and $x_2$ is another unit vector
with $\{x_1, x_2\}$ linearly independent, then $x_2 - (x_2, x_1)x_1$ is a non-zero vector (since
$x_1$ and $x_2$ are independent), is orthogonal to $x_1$ (since $(x_1, x_2 - (x_2, x_1)x_1) =
(x_1, x_2) - \overline{(x_2, x_1)}(x_1, x_1) = (x_1, x_2) - \overline{(x_2, x_1)} = 0$), and $\{x_1, x_2 - (x_2, x_1)x_1\}$ spans
the same space as $\{x_1, x_2\}$. This idea can be extended as follows; the notational
complexity comes about because of the need to renormalize (make the new vector
unit length).

We will only need this for sets whose linear span is dense. 

Theorem 5.27. If $\{x_1, x_2, \dots\}$ is a linearly independent set whose linear span
is dense in $H$, then the set $\{\phi_1, \phi_2, \dots\}$ defined below is an orthonormal basis for
$H$:

$\phi_1 = \frac{x_1}{\|x_1\|},$

$\phi_2 = \frac{x_2 - (x_2, \phi_1)\phi_1}{\|x_2 - (x_2, \phi_1)\phi_1\|},$

and in general for any $n \ge 1$,

$\phi_n = \frac{x_n - (x_n, \phi_1)\phi_1 - (x_n, \phi_2)\phi_2 - \dots - (x_n, \phi_{n-1})\phi_{n-1}}{\|x_n - (x_n, \phi_1)\phi_1 - (x_n, \phi_2)\phi_2 - \dots - (x_n, \phi_{n-1})\phi_{n-1}\|}.$

The proof is obvious unless you try to write it down: the idea is that at each
stage the piece of the next vector $x_n$ that is not orthogonal to the space spanned
by $\{x_1, \dots, x_{n-1}\}$ is subtracted. The vector $\phi_n$ so constructed cannot be zero by
linear independence.
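
The recursion in Theorem 5.27 translates directly into a short program. The following Python sketch works in $\mathbb{C}^n$ with the usual inner product; it is a minimal illustration, and the function name `gram_schmidt`, the tolerance `1e-12` and the three input vectors are assumptions made for the example.

```python
import math

def inner(x, y):
    # (x, y) = sum of x_k * conj(y_k): linear in x, conjugate-linear in y.
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of vectors in C^n."""
    basis = []
    for x in vectors:
        # Subtract the components of x along the vectors found so far.
        r = list(x)
        for phi in basis:
            c = inner(x, phi)
            r = [ri - c * pi for ri, pi in zip(r, phi)]
        length = norm(r)
        if length < 1e-12:
            raise ValueError("vectors are not linearly independent")
        # Renormalize: this is the division by the norm in the theorem.
        basis.append([ri / length for ri in r])
    return basis

phis = gram_schmidt([[1, 1, 0], [1, 0, 1j], [0, 1, 1]])
```

The output vectors satisfy $(\phi_i, \phi_j) = 1$ when $i = j$ and $0$ otherwise, up to floating-point rounding. In floating-point work a "modified" variant that projects the running remainder `r` instead of the original `x` is often preferred for stability; the version above follows the formula in the theorem literally.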

The most important situation in which this is used is to find orthonormal bases
for certain weighted function spaces.

Given $a < b$, $a, b \in [-\infty, \infty]$ and a function $M : (a, b) \to (0, \infty)$ with the
property that $\int_a^b t^n M(t)\,dt < \infty$ for all $n \ge 1$, define the Hilbert space $L_2^M[a, b]$ to
be the linear space of measurable functions $f$ with $\|f\|_M = (f, f)_M^{1/2} < \infty$ where

$(f, g)_M = \int_a^b M(t) f(t) \overline{g(t)}\,dt.$

It may be shown that the linearly independent set $\{1, t, t^2, t^3, \dots\}$ has a linear span
dense in $L_2^M[a, b]$. The Gram-Schmidt orthonormalization process may be applied to
this set to produce various families of classical orthonormal functions.

Example 5.28. [1] If $M(t) = 1$ for all $t$, $a = -1$, $b = 1$, then the process
generates the Legendre polynomials.

[2] If $M(t) = \frac{1}{\sqrt{1 - t^2}}$, $a = -1$, $b = 1$, then the process generates the Tchebychev
polynomials.

[3] If $M(t) = t^{q-1}(1 - t)^{p-q}$, $a = 0$, $b = 1$ (with $q > 0$ and $p - q > -1$), then the
process generates the Jacobi polynomials.

[4] If $M(t) = e^{-t^2}$, $a = -\infty$, $b = \infty$, then the process generates the Hermite
polynomials.

[5] If $M(t) = e^{-t}$, $a = 0$, $b = \infty$, then the process generates the Laguerre
polynomials.
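
Case [1] can be carried out exactly by machine. The sketch below is illustrative and makes two simplifying assumptions: polynomials are stored as coefficient lists (constant term first), and the final normalization to unit length is skipped so that all arithmetic stays rational. The resulting polynomials are therefore scalar multiples of the Legendre polynomials rather than the orthonormal family itself.

```python
from fractions import Fraction

def poly_inner(p, q):
    """Integral over [-1, 1] of p(t) q(t) dt for real coefficient lists (weight M = 1)."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:          # odd powers of t integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

def orthogonalize_monomials(count):
    """Gram-Schmidt on 1, t, t^2, ... without the final normalization step."""
    result = []
    for n in range(count):
        p = [Fraction(0)] * n + [Fraction(1)]       # the monomial t^n
        for q in result:
            c = poly_inner(p, q) / poly_inner(q, q)
            q_padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [a - c * b for a, b in zip(p, q_padded)]
        result.append(p)
    return result

polys = orthogonalize_monomials(4)
```

The third and fourth outputs are $t^2 - \frac{1}{3}$ and $t^3 - \frac{3}{5}t$, which are the Legendre polynomials $\frac{1}{2}(3t^2 - 1)$ and $\frac{1}{2}(5t^3 - 3t)$ up to a constant factor.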



CHAPTER 6 



Fourier analysis 



In the last chapter we saw some very general methods of "Fourier analysis" in
Hilbert space. Of course the methods started with the classical setting of periodic
complex-valued functions on the real line, and in this chapter we describe the
elementary theory of classical Fourier analysis using summability kernels. The
classical theory of Fourier series is a huge subject: the introduction below comes
mostly from Katznelson and from Körner; both are highly recommended for
further study.



1. Fourier series of $L_1$ functions

Denote by $L_1(\mathbb{T})$ the Banach space of complex-valued, Lebesgue integrable
functions on $\mathbb{T} = [0, 2\pi)/0 \sim 2\pi$ (this just means periodic functions).
Modify the $L_1$-norm on this space so that

$\|f\|_1 = \frac{1}{2\pi} \int_0^{2\pi} |f(t)|\,dt.$

What is going on here is simply this: to avoid writing "$2\pi$" hundreds of times, we
make the unit circle have "length" $2\pi$. To recover the useful normalization that the
$L_1$-norm of the constant function 1 is 1, the usual $L_1$-norm is divided by $2\pi$.

Notice that the translate $f_x$ of a function has the same norm, where $f_x(t) =
f(t - x)$.

Definition 6.1. A trigonometric polynomial on $\mathbb{T}$ is an expression of the form

$P(t) = \sum_{n=-N}^{N} a_n e^{int},$

with $a_n \in \mathbb{C}$.

Lemma 6.2. The functions $\{e^{int}\}_{n \in \mathbb{Z}}$ are pairwise orthogonal in $L_2$. That is,

$\frac{1}{2\pi} \int_0^{2\pi} e^{int} e^{-imt}\,dt = \frac{1}{2\pi} \int_0^{2\pi} e^{i(n-m)t}\,dt = 1$ if $n = m$, and $0$ if $n \ne m$.

It follows that if the function $P(t)$ is given, we can recover the coefficients $a_n$
by computing

$a_n = \frac{1}{2\pi} \int_0^{2\pi} P(t) e^{-int}\,dt.$
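
The recovery of the coefficients can be tried numerically. In the sketch below (an illustration, not part of the notes) the integral is replaced by an equally spaced Riemann sum; for a trigonometric polynomial this sum is exact up to rounding as soon as the number of sample points exceeds every frequency difference that occurs. The coefficient table `a` and the function names are hypothetical.

```python
import cmath
import math

def evaluate(coeffs, t):
    """P(t) = sum of a_n e^{int}; coeffs maps the integer n to a_n."""
    return sum(a * cmath.exp(1j * n * t) for n, a in coeffs.items())

def recover_coefficient(coeffs, n, samples=64):
    # Equally spaced Riemann sum for (1/2pi) * integral of P(t) e^{-int} dt.
    total = 0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        total += evaluate(coeffs, t) * cmath.exp(-1j * n * t)
    return total / samples

# A sample trigonometric polynomial of degree 3.
a = {-2: 1 - 1j, 0: 0.5, 1: 2j, 3: -1.5}
recovered = {n: recover_coefficient(a, n) for n in range(-3, 4)}
```

Each `recovered[n]` agrees with `a[n]` where that coefficient is present and is (numerically) zero otherwise, which is exactly the orthogonality statement of Lemma 6.2.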



An Introduction to Harmonic Analysis, Y. Katznelson, Dover Publications, New York (1976).
Fourier Analysis, T. Körner, Cambridge University Press, Cambridge.

It will be useful later to write things like

$P \sim \sum_{n=-N}^{N} a_n e^{int}$

which means that $P$ is identified with the formal sum on the right hand side. The
expression $P(t) = \dots$ is a function defined by the value of the right hand side for
each value of $t$.

Definition 6.3. A trigonometric series on $\mathbb{T}$ is an expression

(56) $S \sim \sum_{n=-\infty}^{\infty} a_n e^{int}.$

The conjugate of $S$ is the series

(57) $\tilde{S} \sim \sum_{n=-\infty}^{\infty} -i\,\mathrm{sign}(n)\, a_n e^{int}$

where $\mathrm{sign}(n) = 0$ if $n = 0$ and $n/|n|$ if not.

Notice that there is no assumption about convergence, so in general $S$ is not
related to a function at all.

Definition 6.4. Let $f \in L_1(\mathbb{T})$. Define the $n$th (classical) Fourier coefficient
of $f$ to be

(58) $\hat{f}(n) = \frac{1}{2\pi} \int f(t) e^{-int}\,dt$

(the integration is from $0$ to $2\pi$ as usual). Associate to $f$ the Fourier series $S[f]$,
which is defined to be the formal trigonometric series

(59) $S[f] \sim \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{int}.$

We say that a given trigonometric series (56) is a Fourier series if it is of the
form (59) for some $f \in L_1(\mathbb{T})$.

Theorem 6.5. Let $f, g \in L_1(\mathbb{T})$. Then
[1] $\widehat{(f + g)}(n) = \hat{f}(n) + \hat{g}(n)$.
[2] For $\lambda \in \mathbb{C}$, $\widehat{(\lambda f)}(n) = \lambda\hat{f}(n)$.
[3] If $\bar{f}(t) = \overline{f(t)}$ is the complex conjugate of $f$ then $\hat{\bar{f}}(n) = \overline{\hat{f}(-n)}$.
[4] If $f_x(t) = f(t - x)$ is the translate of $f$, then $\hat{f_x}(n) = e^{-inx}\hat{f}(n)$.
[5] $|\hat{f}(n)| \le \frac{1}{2\pi} \int |f(t)|\,dt = \|f\|_1$.

Prove these as an exercise.

Notice that $f \mapsto \hat{f}$ sends a function in $L_1(\mathbb{T})$ to a function in $C(\mathbb{Z})$, the
continuous functions on $\mathbb{Z}$ with the sup norm. This map is continuous in the following
sense.

Corollary 6.6. Assume $(f_j)$ is a sequence in $L_1(\mathbb{T})$ with $\|f_j - f\|_1 \to 0$.
Then $\hat{f}_j \to \hat{f}$ uniformly.

Proof. This follows at once from Theorem 6.5 [5]. □



Theorem 6.7. Let $f \in L_1(\mathbb{T})$ have $\hat{f}(0) = 0$. Define

$F(t) = \int_0^t f(s)\,ds.$

Then $F$ is continuous, $2\pi$ periodic, and

$\hat{F}(n) = \frac{1}{in}\hat{f}(n)$

for all $n \ne 0$.

Proof. It is clear that $F$ is continuous since it is the integral of an $L_1$ function.
Also,

$F(t + 2\pi) - F(t) = \int_t^{t+2\pi} f(s)\,ds = 2\pi\hat{f}(0) = 0.$

Finally, using integration by parts (the boundary terms vanish because $F(0) = F(2\pi) = 0$),

$\hat{F}(n) = \frac{1}{2\pi} \int_0^{2\pi} F(t) e^{-int}\,dt = \frac{1}{2\pi} \int_0^{2\pi} F'(t) \frac{1}{in} e^{-int}\,dt = \frac{1}{in}\hat{f}(n).$

□

Notice that we have used the symbol F' - the function F is differentiable 
because of the way it was defined. 

2. Convolution in $L_1$

In this section we introduce a form of "multiplication" on $L_1(\mathbb{T})$ that makes it
into a Banach algebra (see Definition 3.10). Notice that the only real property
we will use is that the circle $\mathbb{T}$ is a group on which the measure $ds$ is translation
invariant:

$\int f_x(s)\,ds = \int f(s)\,ds.$

Theorem 6.8. Assume that $f, g$ are in $L_1(\mathbb{T})$. Then, for almost every $t$, the
function $f(t - s)g(s)$ is integrable as a function of $s$. Define the convolution of $f$
and $g$ to be

(60) $(f * g)(t) = \frac{1}{2\pi} \int f(t - s)g(s)\,ds.$

Then $f * g \in L_1(\mathbb{T})$, with norm

$\|f * g\|_1 \le \|f\|_1 \|g\|_1.$

Moreover

$\widehat{(f * g)}(n) = \hat{f}(n)\hat{g}(n).$

Proof. It is clear that $F(t, s) = f(t - s)g(s)$ is a measurable function of the
variable $(s, t)$. For almost all $s$, $F(t, s)$ is a constant multiple of $f_s$, so is integrable.
Moreover

$\frac{1}{2\pi} \int \left(\frac{1}{2\pi} \int |F(t, s)|\,dt\right) ds = \frac{1}{2\pi} \int |g(s)|\,\|f\|_1\,ds = \|f\|_1 \|g\|_1.$

So, by Fubini's Theorem 4.11, $f(t - s)g(s)$ is integrable as a function of $s$ for almost
all $t$, and

$\frac{1}{2\pi} \int |(f * g)(t)|\,dt = \frac{1}{2\pi} \int \left|\frac{1}{2\pi} \int F(t, s)\,ds\right| dt \le \frac{1}{4\pi^2} \iint |F(t, s)|\,dt\,ds = \|f\|_1 \|g\|_1,$

showing that $\|f * g\|_1 \le \|f\|_1 \|g\|_1$. Finally, using Fubini again to justify a change
in the order of integration,

$\widehat{(f * g)}(n) = \frac{1}{2\pi} \int (f * g)(t) e^{-int}\,dt = \frac{1}{4\pi^2} \iint f(t - s) e^{-in(t-s)} g(s) e^{-ins}\,dt\,ds$

$= \frac{1}{2\pi} \int f(t) e^{-int}\,dt \cdot \frac{1}{2\pi} \int g(s) e^{-ins}\,ds$
$= \hat{f}(n)\hat{g}(n).$

□
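
For trigonometric polynomials the multiplicativity $\widehat{(f * g)}(n) = \hat{f}(n)\hat{g}(n)$ can be verified by direct computation. The following Python sketch is an illustration under assumptions: both integrals are replaced by equally spaced Riemann sums with `SAMPLES` points, which is exact (up to rounding) for the low-degree polynomials used here, and the two coefficient tables are hypothetical.

```python
import cmath
import math

SAMPLES = 32   # exceeds every frequency that occurs below, so the sums are exact

def trig_poly(coeffs):
    """Return the function t -> sum of a_n e^{int} for the given coefficient dict."""
    return lambda t: sum(a * cmath.exp(1j * n * t) for n, a in coeffs.items())

def mean_over_circle(h):
    """(1/2pi) * integral of h over [0, 2pi), as an equally spaced Riemann sum."""
    return sum(h(2 * math.pi * k / SAMPLES) for k in range(SAMPLES)) / SAMPLES

def convolve(f, g):
    # (f * g)(t) = (1/2pi) * integral of f(t - s) g(s) ds, as in (60).
    return lambda t: mean_over_circle(lambda s: f(t - s) * g(s))

def fourier_coefficient(h, n):
    return mean_over_circle(lambda t: h(t) * cmath.exp(-1j * n * t))

f = trig_poly({-1: 2.0, 0: 1.0, 2: 1j})
g = trig_poly({-1: -1j, 1: 3.0, 2: 0.5})
h = convolve(f, g)
```

Only the frequencies common to `f` and `g` (here $-1$ and $2$) survive in `h`, with coefficients equal to the products of the corresponding coefficients.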

Lemma 6.9. The operation $(f, g) \mapsto f * g$ is commutative, associative, and
distributive over addition.

Prove this as an exercise.

Lemma 6.10. If $f \in L_1(\mathbb{T})$ and $k(t) = \sum_{n=-N}^{N} a_n e^{int}$ then

$(k * f)(t) = \sum_{n=-N}^{N} a_n \hat{f}(n) e^{int}.$

Thus convolving with the function $e^{int}$ picks out the $n$th Fourier coefficient.
Proof. Simply check this one term at a time: if $\chi_n(t) = e^{int}$, then

$(\chi_n * f)(t) = \frac{1}{2\pi} \int e^{in(t-s)} f(s)\,ds = e^{int} \frac{1}{2\pi} \int f(s) e^{-ins}\,ds.$

□

3. Summability kernels and homogeneous Banach algebras 

Two properties of the Banach space $L_1(\mathbb{T})$ are particularly important for Fourier
analysis.

Theorem 6.11. If $f \in L_1(\mathbb{T})$ and $x \in \mathbb{T}$, then

$f_x(t) = f(t - x) \in L_1(\mathbb{T})$ and $\|f_x\|_1 = \|f\|_1$.

Also, the function $x \mapsto f_x$ is continuous on $\mathbb{T}$ for each $f \in L_1(\mathbb{T})$.

Proof. The translation invariance is clear.

In order to prove the continuity we must show that

(61) $\lim_{x \to x_0} \|f_x - f_{x_0}\|_1 = 0.$

Now (61) is clear if $f$ is continuous. On the other hand, the continuous functions
are dense in $L_1(\mathbb{T})$, so given $f \in L_1(\mathbb{T})$ and $\epsilon > 0$ we may choose $g \in C(\mathbb{T})$ such
that

$\|g - f\|_1 < \epsilon.$

Then

$\|f_x - f_{x_0}\|_1 \le \|f_x - g_x\|_1 + \|g_x - g_{x_0}\|_1 + \|g_{x_0} - f_{x_0}\|_1$
$= \|(f - g)_x\|_1 + \|g_x - g_{x_0}\|_1 + \|(g - f)_{x_0}\|_1$

$< 2\epsilon + \|g_x - g_{x_0}\|_1.$

It follows that

$\limsup_{x \to x_0} \|f_x - f_{x_0}\|_1 \le 2\epsilon,$

so the theorem is proved. □

Definition 6.12. A summability kernel is a sequence $(k_n)$ of continuous
$2\pi$-periodic functions with the following properties:

(62) $\frac{1}{2\pi} \int k_n(t)\,dt = 1$ for all $n$.

(63) There is an $R$ such that $\frac{1}{2\pi} \int |k_n(t)|\,dt \le R$ for all $n$.

(64) For all $\delta > 0$, $\lim_{n \to \infty} \int_{\delta}^{2\pi - \delta} |k_n(t)|\,dt = 0$.

If in addition $k_n(t) \ge 0$ for all $n$ and $t$ then $(k_n)$ is called a positive summability
kernel.

Theorem 6.13. Let $f \in L_1(\mathbb{T})$ and let $(k_n)$ be a summability kernel. Then

$f = \lim_{n \to \infty} \frac{1}{2\pi} \int k_n(s) f_s\,ds$

in the $L_1$ norm.

Proof. Write $\phi(s) = f_s(t) = f(t - s)$ for fixed $t$. By Theorem 6.11, $\phi$ is
a continuous $L_1(\mathbb{T})$-valued function on $\mathbb{T}$, and $\phi(0) = f$. We will be integrating
$L_1(\mathbb{T})$-valued functions; see the Appendix for a brief definition of what this means.

Then for any $0 < \delta < \pi$, by (62) we have

$\frac{1}{2\pi} \int k_n(s)\phi(s)\,ds - \phi(0) = \frac{1}{2\pi} \int k_n(s)(\phi(s) - \phi(0))\,ds$

$= \frac{1}{2\pi} \int_{-\delta}^{\delta} k_n(s)(\phi(s) - \phi(0))\,ds + \frac{1}{2\pi} \int_{\delta}^{2\pi - \delta} k_n(s)(\phi(s) - \phi(0))\,ds.$

The two parts may be estimated separately:

(65) $\left\|\frac{1}{2\pi} \int_{-\delta}^{\delta} k_n(s)(\phi(s) - \phi(0))\,ds\right\|_1 \le \max_{|s| \le \delta} \|\phi(s) - \phi(0)\|_1 \|k_n\|_1,$

and

(66) $\left\|\frac{1}{2\pi} \int_{\delta}^{2\pi - \delta} k_n(s)(\phi(s) - \phi(0))\,ds\right\|_1 \le \max_{s} \|\phi(s) - \phi(0)\|_1 \frac{1}{2\pi} \int_{\delta}^{2\pi - \delta} |k_n(s)|\,ds.$

Using (63) and the fact that $\phi$ is continuous at $s = 0$, given any $\epsilon > 0$ there is
a $\delta > 0$ such that (65) is bounded by $\epsilon$. With the same $\delta$, (64) implies that (66)
converges to $0$ as $n \to \infty$, so that $\frac{1}{2\pi} \int k_n(s)\phi(s)\,ds - \phi(0)$ is bounded by $2\epsilon$ for
large $n$. □

The integral appearing in Theorem 6.13 looks a bit like a convolution of
$L_1(\mathbb{T})$-valued functions. This is not a problem for us. Consider first the following lemma.

Lemma 6.14. Let $k$ be a continuous function on $\mathbb{T}$, and $f \in L_1(\mathbb{T})$. Then

(67) $\frac{1}{2\pi} \int k(s) f_s\,ds = k * f.$

Proof. Assume first that $f$ is continuous on $\mathbb{T}$. Then, making the obvious
definition for the integral,

$\frac{1}{2\pi} \int k(s) f_s\,ds = \frac{1}{2\pi} \lim \sum_j (s_{j+1} - s_j) k(s_j) f_{s_j},$

with the limit taken in the $L_1(\mathbb{T})$ norm as the partition of $\mathbb{T}$ defined by $\{s_1, \dots, s_j, \dots\}$
becomes finer. On the other hand,

$\frac{1}{2\pi} \lim \sum_j (s_{j+1} - s_j) k(s_j) f(t - s_j) = (k * f)(t)$

uniformly, proving the lemma for continuous $f$.

For arbitrary $f \in L_1(\mathbb{T})$, fix $\epsilon > 0$ and choose a continuous function $g$ with
$\|f - g\|_1 < \epsilon$. Then

$\frac{1}{2\pi} \int k(s) f_s\,ds - k * f = \frac{1}{2\pi} \int k(s)(f - g)_s\,ds + k * (g - f),$

so

$\left\|\frac{1}{2\pi} \int k(s) f_s\,ds - k * f\right\|_1 \le 2\|k\|_1 \epsilon.$

□

Lemma 6.14 means that Theorem 6.13 can be written in the form

$f = \lim_{n \to \infty} k_n * f$ in $L_1$.



4. Fejer's kernel 

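Define a sequence of functions

(68) $K_n(t) = \sum_{j=-n}^{n} \left(1 - \frac{|j|}{n + 1}\right) e^{ijt}.$

Lemma 6.15. The sequence $(K_n)$ is a summability kernel.

Proof. Property (62) is clear.
Now notice that

$\left(-\frac{1}{4} e^{-it} + \frac{1}{2} - \frac{1}{4} e^{it}\right) \sum_{j=-n}^{n} \left(1 - \frac{|j|}{n + 1}\right) e^{ijt} = \frac{1}{n + 1} \left(-\frac{1}{4} e^{-i(n+1)t} + \frac{1}{2} - \frac{1}{4} e^{i(n+1)t}\right).$

On the other hand,

$\sin^2 \frac{t}{2} = \frac{1}{2}(1 - \cos t) = -\frac{1}{4} e^{-it} + \frac{1}{2} - \frac{1}{4} e^{it},$

so

(69) $K_n(t) = \frac{1}{n + 1} \left(\frac{\sin \frac{(n+1)t}{2}}{\sin \frac{t}{2}}\right)^2.$

Property (64) follows, and this also shows that $K_n(t) \ge 0$ for all $n$ and $t$.
Prove property (63) as an exercise.

□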
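
The agreement between the defining sum and the closed form of the Fejér kernel can be checked numerically. The sketch below assumes the standard definition $K_n(t) = \sum_{j=-n}^{n} (1 - |j|/(n+1)) e^{ijt}$ and the closed form $\frac{1}{n+1} \left(\sin\frac{(n+1)t}{2} / \sin\frac{t}{2}\right)^2$; the function names and the sample grid are choices made for this example.

```python
import cmath
import math

def fejer_sum(n, t):
    """K_n(t) from the defining sum with weights 1 - |j|/(n+1)."""
    total = sum((1 - abs(j) / (n + 1)) * cmath.exp(1j * j * t)
                for j in range(-n, n + 1))
    return total.real          # the imaginary parts cancel in pairs

def fejer_closed(n, t):
    """K_n(t) from the closed form; valid when sin(t/2) is non-zero."""
    return (math.sin((n + 1) * t / 2) / math.sin(t / 2)) ** 2 / (n + 1)

n = 11
points = [0.1 + 0.05 * k for k in range(120)]   # sample points inside (0, 2*pi)
max_gap = max(abs(fejer_sum(n, t) - fejer_closed(n, t)) for t in points)
smallest_value = min(fejer_sum(n, t) for t in points)
```

On this grid the two expressions agree to rounding error and the kernel never goes negative, in line with the positivity statement in the proof. At $t = 0$ the sum equals $n + 1$, which is the peak value of the kernel.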



[Figure: graph of the Fejér kernel $K_{11}$.]

Definition 6.16. Write $\sigma_n(f) = K_n * f$.
Using Lemma 6.10 it follows that

(70) $\sigma_n(f)(t) = \sum_{j=-n}^{n} \left(1 - \frac{|j|}{n + 1}\right) \hat{f}(j) e^{ijt},$

and Theorem 6.13, applied to the summability kernel $(K_n)$, means that

$\sigma_n(f) \to f$

in the $L_1$ norm for every $f \in L_1(\mathbb{T})$. It follows at once that the trigonometric
polynomials are dense in $L_1(\mathbb{T})$. The most important consequences are however
more general statements about Fourier series.

Theorem 6.17. If $f, g \in L_1(\mathbb{T})$ have $\hat{f}(n) = \hat{g}(n)$ for all $n \in \mathbb{Z}$, then $f = g$.

Proof. It is enough to show that $\hat{f}(n) = 0$ for all $n$ implies that $f = 0$. Using
(70), we see that if $\hat{f}(n) = 0$ for all $n$, then $\sigma_n(f) = 0$ for all $n$; since $\sigma_n(f) \to f$, it
follows that $f = 0$. □

Corollary 6.18. The family of functions $\{e^{int}\}_{n \in \mathbb{Z}}$ forms a complete orthonormal
system in $L_2(\mathbb{T})$.

Proof. It is enough to notice that

$(f, e^{int}) = \hat{f}(n).$

Then for all $f \in L_2(\mathbb{T})$, the function $f$ and its Fourier series have identical Fourier
coefficients, so must agree. □

We also find a very general statement about the decay of Fourier coefficients:
the Riemann-Lebesgue Lemma.

Theorem 6.19. Let $f \in L_1(\mathbb{T})$. Then $\lim_{|n| \to \infty} \hat{f}(n) = 0$.

Proof. Fix an $\epsilon > 0$, and choose a trigonometric polynomial $P$ with the
property that

$\|f - P\|_1 < \epsilon.$

If $|n|$ exceeds the degree of $P$, then

$|\hat{f}(n)| = |\widehat{(f - P)}(n)| \le \|f - P\|_1 < \epsilon.$

□

Recall that for $f \in L_1(\mathbb{T})$, the Fourier series was defined (formally) to be

$S[f] \sim \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{int},$

and the $n$th partial sum corresponds to the function

(71) $S_n(f)(t) = \sum_{j=-n}^{n} \hat{f}(j) e^{ijt}.$

Looking at equations (71) and (70), we see that $\sigma_n(f)$ is the arithmetic mean of
$S_0(f), S_1(f), \dots, S_n(f)$:

(72) $\sigma_n(f) = \frac{1}{n + 1} \left(S_0(f) + S_1(f) + \dots + S_n(f)\right).$

It follows that if $S_n(f)$ converges in $L_1(\mathbb{T})$, then it must converge to the same thing
as $\sigma_n$, that is to $f$ (if this is not clear to you, look at Corollary 6.22 below).
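
The statement that $\sigma_n(f)$ is the arithmetic mean of the partial sums can be checked one frequency at a time: the frequency $j$ appears in $S_k(f)$ exactly when $|j| \le k$. The Python sketch below (illustrative; the function names are assumptions) compares the resulting averaged weight with the Fejér weight $1 - |j|/(n+1)$ using exact rational arithmetic.

```python
from fractions import Fraction

def fejer_weight(n, j):
    """Weight of the frequency j in sigma_n(f): 1 - |j|/(n+1), or 0 if |j| > n."""
    return max(Fraction(0), 1 - Fraction(abs(j), n + 1))

def averaged_partial_sum_weight(n, j):
    """Weight of the frequency j in (S_0 + ... + S_n)/(n+1)."""
    # S_k(f) contains the frequency j exactly when |j| <= k.
    hits = sum(1 for k in range(n + 1) if abs(j) <= k)
    return Fraction(hits, n + 1)

mismatches = [(n, j)
              for n in range(0, 8)
              for j in range(-10, 11)
              if fejer_weight(n, j) != averaged_partial_sum_weight(n, j)]
```

The list `mismatches` is empty: the two weights coincide for every frequency, which is the coefficient-level form of the arithmetic-mean identity.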

The partial sums $S_n(f)$ also have a convolution form: using (70) we have that
$S_n(f) = D_n * f$ where $(D_n)$ is the Dirichlet kernel defined by

$D_n(t) = \sum_{j=-n}^{n} e^{ijt} = \frac{\sin\left(n + \frac{1}{2}\right)t}{\sin \frac{1}{2}t}.$

Notice that $(D_n)$ is not a summability kernel: it has property (62) but does not
have (63) (as we saw in Lemma 3.16) nor does it have (64). This explains why the
question of convergence for Fourier series is so much more subtle than the problem
of summability.

[Figure: graph of the Dirichlet kernel $D_{11}$.]




Definition 6.20. The de la Vallee Poussin kernel is defined by 
V n (t) = 2K 2n+1 (t) - K n (t). 
Properties ((62]) . (j63|) and (|64)) are clear. 



5. POINTWISE CONVERGENCE 

[Figure: the de la Vallée Poussin kernel with $n = 11$.]

This kernel is useful because $V_n$ is a polynomial of degree $2n + 1$ with $\hat{V}_n(j) = 1$
for $|j| \le n + 1$, so it may be used to construct approximations to a function $f$
by trigonometric polynomials having the same Fourier coefficients as $f$ for small
frequencies.
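The claim about the coefficients of $V_n$ can be checked with exact arithmetic. This is a small sketch, assuming the Fejér kernel is normalised so that $\hat{K}_m(j) = 1 - \frac{|j|}{m+1}$ for $|j| \le m$ and $0$ otherwise; the function names are hypothetical.

```python
from fractions import Fraction

def fejer_coefficient(m, j):
    """Fourier coefficient of K_m at frequency j: max(0, 1 - |j|/(m+1))."""
    return max(Fraction(0), 1 - Fraction(abs(j), m + 1))

def vallee_poussin_coefficient(n, j):
    """Fourier coefficient of V_n = 2 K_{2n+1} - K_n at frequency j."""
    return 2 * fejer_coefficient(2 * n + 1, j) - fejer_coefficient(n, j)

n = 11
coeffs = {j: vallee_poussin_coefficient(n, j) for j in range(-30, 31)}
print([str(coeffs[j]) for j in (0, 12, 13, 23, 24)])
```

For $n = 11$ the coefficients equal 1 up to $|j| = 12$, then decrease linearly, and vanish from $|j| = 24$ onwards, so the degree is $2n + 1 = 23$.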

5. Pointwise convergence 

Recall that a sequence of elements $(x_n)$ in a normed space $(X, \|\cdot\|)$ converges
to $x$ if $\|x_n - x\| \to 0$ as $n \to \infty$. If the space $X$ is a space of complex-valued
functions on some set $Z$ (for example, $L_1(\mathbb{T})$, $C(\mathbb{T})$), then there is another notion
of convergence: $x_n$ converges to $x$ pointwise if for every $z \in Z$, $x_n(z) \to x(z)$
as a sequence of complex numbers. The question addressed in this section is the
following: does the Fourier series of a function converge pointwise to the original
function?

In the last section, we showed that for $L_1$ functions on the circle, $\sigma_n(f)$ converges
to $f$ with respect to the norm of any homogeneous Banach algebra containing
$f$. Applying this to the Banach algebra of continuous functions with the sup norm,
we have that $\sigma_n(f) \to f$ uniformly for all $f \in C(\mathbb{T})$.

If the function $f$ is not continuous on $\mathbb{T}$, then the convergence in norm of $\sigma_n(f)$
does not tell us anything about the pointwise convergence. In addition, if $\sigma_n(f,t)$
converges for some $t$, there is no real reason for the limit to be $f(t)$.

Theorem 6.21. Let $f$ be a function in $L_1(\mathbb{T})$.

(a) If

$$\lim_{h \to 0} \left(f(t+h) + f(t-h)\right)$$

exists (the possibility that the limit is $\pm\infty$ is allowed), then

$$\sigma_n(f,t) \longrightarrow \tfrac{1}{2} \lim_{h \to 0} \left(f(t+h) + f(t-h)\right).$$

(b) If $f$ is continuous at $t$, then $\sigma_n(f,t) \to f(t)$.

(c) If there is a closed interval $I \subset \mathbb{T}$ on which $f$ is continuous, then $\sigma_n(f,\cdot)$
converges uniformly to $f$ on $I$.
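Parts (a) and (b) can be illustrated on a function with a jump. The sketch below is an example of our own choosing, not taken from the notes: $f$ is the indicator of $[0,1)$ on the circle, whose coefficients are $\hat{f}(0) = \frac{1}{2\pi}$ and $\hat{f}(k) = \frac{1 - e^{-ik}}{2\pi i k}$ for $k \ne 0$ (assuming the $\frac{1}{2\pi}$ normalisation), and the Fejér mean is computed from the weighted form with weights $1 - \frac{|k|}{n+1}$.

```python
import cmath
import math

def indicator_coefficient(k):
    """Fourier coefficient at frequency k of the indicator of [0, 1)."""
    if k == 0:
        return 1 / (2 * math.pi)
    return (1 - cmath.exp(-1j * k)) / (2j * math.pi * k)

def fejer_mean(n, t):
    """sigma_n(f, t) for the indicator of [0, 1), using triangular weights."""
    total = 0j
    for k in range(-n, n + 1):
        weight = 1 - abs(k) / (n + 1)
        total += weight * indicator_coefficient(k) * cmath.exp(1j * k * t)
    return total.real

at_jump = fejer_mean(2000, 0.0)      # the one-sided limits are 0 and 1
inside = fejer_mean(2000, 0.5)       # f is continuous here with value 1
outside = fejer_mean(2000, 2.0)      # f is continuous here with value 0
print(at_jump, inside, outside)
```

At the jump the means approach the midpoint $\frac12$ of the two one-sided limits, and at points of continuity they approach the value of $f$.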






Corollary 6.22. If f is continuous at t, and if the Fourier series of f con- 
verges at t, then it must converge to f(t). 

PROOF. Recall equation (72):

$$\sigma_n(f) = \frac{1}{n+1}\left(S_0(f) + S_1(f) + \cdots + S_n(f)\right).$$

By assumption and (b), $\sigma_n(f,t) \to f(t)$ and $S_n(f,t) \to S(t)$ say. Write $S_k(t)$ for
$S_k(f,t)$ and $m = \lfloor \sqrt{n} \rfloor$, and split the right hand side as

$$\frac{1}{n+1}\left(S_0(t) + S_1(t) + \cdots + S_m(t)\right) + \frac{1}{n+1}\left(S_{m+1}(t) + \cdots + S_n(t)\right).$$

The first term converges to zero as $n \to \infty$ (since the convergent sequence $(S_n(t))$
is bounded). For the second term, choose and fix $\epsilon > 0$ and choose $n$ so large that

$$|S_k(t) - S(t)| < \epsilon$$

for all $k > \sqrt{n}$. Then the whole second term is within $\left(\frac{n-m}{n+1}\right)\epsilon$ of
$\left(\frac{n-m}{n+1}\right) S(t)$, and $\frac{n-m}{n+1} \to 1$. It follows that

$$\frac{1}{n+1}\left(S_0(f,t) + S_1(f,t) + \cdots + S_n(f,t)\right) \to S(t)$$

as $n \to \infty$, so $S(t)$ must coincide with $\lim_{n \to \infty} \sigma_n(f,t) = f(t)$. □

Turning to the proof of Theorem 6.21, recall that the Fejér kernel $(K_n)$ (see
Lemma 6.15) is a positive summability kernel with the following properties:

$$\lim_{n \to \infty} \left( \sup_{\theta < t < 2\pi - \theta} K_n(t) \right) = 0 \quad \text{for any } \theta \in (0, \pi), \tag{73}$$

and

$$K_n(t) = K_n(-t). \tag{74}$$
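PROOF OF THEOREM 6.21. Define

$$\check{f}(t) = \lim_{h \to 0} \tfrac{1}{2}\left(f(t+h) + f(t-h)\right),$$

and assume that this limit is finite (a similar argument works for the infinite cases).
We wish to show that $\sigma_n(f,t) - \check{f}(t)$ is small for large $n$. Evaluate the difference,
for any $\theta \in (0, \pi)$:

$$\begin{aligned}
\sigma_n(f,t) - \check{f}(t) &= \frac{1}{2\pi} \int_{\mathbb{T}} K_n(\tau)\left(f(t-\tau) - \check{f}(t)\right) d\tau \\
&= \frac{1}{2\pi} \left( \int_{-\theta}^{\theta} K_n(\tau)\left(f(t-\tau) - \check{f}(t)\right) d\tau
 + \int_{\theta}^{2\pi-\theta} K_n(\tau)\left(f(t-\tau) - \check{f}(t)\right) d\tau \right).
\end{aligned}$$

Applying (74) this may be written

$$\sigma_n(f,t) - \check{f}(t) = \frac{1}{\pi} \left( \int_0^{\theta} + \int_{\theta}^{\pi} \right) K_n(\tau) \left( \frac{f(t-\tau) + f(t+\tau)}{2} - \check{f}(t) \right) d\tau. \tag{75}$$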

Fix $\epsilon > 0$, and choose $\theta \in (0, \pi)$ small enough to ensure that

$$\tau \in (-\theta, \theta) \implies \left| \frac{f(t-\tau) + f(t+\tau)}{2} - \check{f}(t) \right| < \epsilon, \tag{76}$$

and choose $N$ large enough to ensure that

$$n > N \implies \sup_{\theta < \tau < 2\pi - \theta} K_n(\tau) < \epsilon. \tag{77}$$

Putting the estimates (76) and (77) into the expression (75) gives

$$|\sigma_n(f,t) - \check{f}(t)| < \epsilon + \epsilon \|f - \check{f}(t)\|_1,$$

which proves (a).

Part (b) follows at once from (a). 

For (c), notice¹ that $f$ must be uniformly continuous on $I$. This means that
(given $\epsilon > 0$) $\theta$ can be chosen so that (76) holds for all $t \in I$ and $N$ depends only
on $\theta$ and $\epsilon$. This means that a uniform estimate of the form

$$|\sigma_n(f,t) - f(t)| < \epsilon + \epsilon \|f - f(t)\|_1$$

can be found for all $t \in I$. □



6. Lebesgue's Theorem 

The Fejér condition, that

$$\check{f}(t) = \lim_{h \to 0} \frac{f(t+h) + f(t-h)}{2} \tag{78}$$

exists is very strong, and is not preserved if the function $f$ is modified on a null
set. This means that property (78) is not really well-defined on $L_1$. However, (78)
implies another property: there is a number $\check{f}(t)$ for which

$$\lim_{h \to 0} \frac{1}{h} \int_0^h \left| \frac{f(t+\tau) + f(t-\tau)}{2} - \check{f}(t) \right| d\tau = 0. \tag{79}$$

This is a more robust condition, better suited to integrable functions².

Theorem 6.23. If $f$ has property (79) at $t$, then $\sigma_n(f,t) \to \check{f}(t)$. In particular
(by the footnote), for almost every value of $t$, $\sigma_n(f,t) \to f(t)$.

Corollary 6.24. If the Fourier series of $f \in L_1(\mathbb{T})$ converges on a set $F$ of
positive measure, then almost everywhere on $F$ the Fourier series must converge to
$f$. In particular, a Fourier series that converges to zero almost everywhere must
have all its coefficients equal to zero.

Remark 6.1. The case of trigonometric series is different: a basic counterexample
in the theory of trigonometric series is that there are non-zero trigonometric
series that converge to zero almost everywhere. On the other hand, a trigonometric
series that converges to zero everywhere must have all coefficients zero³.

PROOF OF THEOREM 6.23. Recall the expression (75) in the proof of Theorem 6.21:

$$\sigma_n(f,t) - \check{f}(t) = \frac{1}{\pi} \left( \int_0^{\theta} + \int_{\theta}^{\pi} \right) K_n(\tau) \left( \frac{f(t-\tau) + f(t+\tau)}{2} - \check{f}(t) \right) d\tau. \tag{80}$$




¹ A continuous function on a closed bounded interval is uniformly continuous.

² There are functions $f$ with the property that Fejér's condition (78) does not hold anywhere,
but (79) does hold for any $f \in L_1(\mathbb{T})$, for almost all $t$, with $\check{f}(t) = f(t)$. This is described in
volume 1 of Trigonometric Series, A. Zygmund, Cambridge University Press, Cambridge (1959).

³ See Chapter 5 of Ensembles parfaits et séries trigonométriques, J.-P. Kahane and R. Salem,
Hermann (1963).

Also, by (55),

$$K_n(\tau) = \frac{1}{n+1} \left( \frac{\sin \frac{(n+1)\tau}{2}}{\sin \frac{\tau}{2}} \right)^2 \tag{81}$$

and $\sin \frac{\tau}{2} > \frac{\tau}{\pi}$ for $0 < \tau < \pi$, so

$$K_n(\tau) \le \min\left( n+1, \frac{\pi^2}{(n+1)\tau^2} \right). \tag{82}$$

It follows that the second integral in (80) will converge to zero so long as
$(n+1)\theta^2 \to \infty$. Pick $\theta = n^{-1/4}$; this guarantees that as $n \to \infty$ the second integral tends to
zero.
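The bound $K_n(\tau) \le \min\left(n+1, \frac{\pi^2}{(n+1)\tau^2}\right)$ on $(0, \pi)$ can be sanity-checked on a grid. This is only a numerical spot check under the closed form $K_n(\tau) = \frac{1}{n+1}\left(\sin\frac{(n+1)\tau}{2} / \sin\frac{\tau}{2}\right)^2$, not a proof, and the function names are hypothetical.

```python
import math

def fejer(n, tau):
    """Closed form of the Fejer kernel K_n(tau) for 0 < tau < pi."""
    return (math.sin((n + 1) * tau / 2) / math.sin(tau / 2)) ** 2 / (n + 1)

def bound(n, tau):
    """The upper bound min(n + 1, pi^2 / ((n + 1) tau^2))."""
    return min(n + 1, math.pi ** 2 / ((n + 1) * tau ** 2))

def worst_ratio(n, samples=5000):
    """Largest value of K_n / bound over an interior grid of (0, pi)."""
    worst = 0.0
    for k in range(1, samples):
        tau = math.pi * k / samples
        worst = max(worst, fejer(n, tau) / bound(n, tau))
    return worst

ratios = {n: worst_ratio(n) for n in (1, 10, 100)}
print(ratios)
```

A ratio that never exceeds 1 (up to rounding) is consistent with the estimate; the ratio approaches 1 near $\tau = 0$, where $K_n$ is close to its maximum $n + 1$.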

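Now consider the first integral. Write

$$\Phi(h) = \int_0^h \left| \frac{f(t+\tau) + f(t-\tau)}{2} - \check{f}(t) \right| d\tau.$$

Then

$$\frac{1}{\pi} \left| \int_0^{\theta} K_n(\tau) \left( \frac{f(t+\tau) + f(t-\tau)}{2} - \check{f}(t) \right) d\tau \right|$$

is bounded above by

$$\frac{n+1}{\pi} \int_0^{1/n} \left| \frac{f(t+\tau) + f(t-\tau)}{2} - \check{f}(t) \right| d\tau
 + \frac{\pi}{n+1} \int_{1/n}^{\theta} \left| \frac{f(t+\tau) + f(t-\tau)}{2} - \check{f}(t) \right| \frac{d\tau}{\tau^2}$$

(we have used the estimate for $K_n$ from (82)). By the assumption (79), the first
term, $\frac{n+1}{\pi} \Phi\left(\frac{1}{n}\right)$, tends to zero. Applying integration by parts to the second term gives

$$\frac{\pi}{n+1} \int_{1/n}^{\theta} \left| \frac{f(t+\tau) + f(t-\tau)}{2} - \check{f}(t) \right| \frac{d\tau}{\tau^2}
 = \frac{\pi}{n+1} \left[ \frac{\Phi(\tau)}{\tau^2} \right]_{1/n}^{\theta} + \frac{2\pi}{n+1} \int_{1/n}^{\theta} \frac{\Phi(\tau)}{\tau^3} \, d\tau. \tag{83}$$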



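For given $\epsilon > 0$ and $n > n(\epsilon)$, (79) gives

$$\Phi(\tau) < \epsilon \tau \quad \text{for } \tau \in (0, \theta = n^{-1/4}).$$

It follows that (83) is bounded above by

$$\frac{\pi \epsilon n^{1/4}}{n+1} + \frac{2\pi\epsilon}{n+1} \int_{1/n}^{\theta} \frac{d\tau}{\tau^2} < 3\pi\epsilon,$$

which completes the proof. □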



APPENDIX A 



1. Zorn's lemma and Hamel bases 



Definition A.1. A partially ordered set or poset is a non-empty set $S$ together
with a relation $\le$ that satisfies the following conditions:

(i) $x \le x$ for all $x \in S$;

(ii) if $x \le y$ and $y \le z$ then $x \le z$ for all $x, y, z \in S$.

If in addition for any two elements $x, y$ of $S$ at least one of the relations $x \le y$
or $y \le x$ holds, then we say that $S$ is a totally ordered set.

For example, the set of subsets of a set $X$, with $\le$ meaning inclusion, is a partially
ordered set.

Definition A.2. Let $S$ be a partially ordered set, and $T$ any subset of $S$. An
element $x \in S$ is an upper bound of $T$ if $y \le x$ for all $y \in T$.

Definition A.3. Let $S$ be a partially ordered set. An element $x \in S$ is
maximal if for any $y \in S$, $x \le y \implies y \le x$.

The next result, Zorn's lemma, is one of the formulations of the Axiom of 
Choice. 

Theorem A. 4. If S is a partially ordered set in which every totally ordered 
subset has an upper bound, then S has a maximal element. 

This result is used frequently to "construct" things - though whenever we use it 
all we really are able to do is assert that something must exist subject to assuming 
the Axiom of Choice. An example is the following result - as usual, trivial in finite 
dimensions. 

To see that the following theorem is "constructing" something a little surprising,
think of the following examples: $\mathbb{R}$ is a linear space over $\mathbb{Q}$; $L_2[0,1]$ is a linear space
over $\mathbb{R}$.

Theorem A.5. Let $X$ be a linear space over any field. Then $X$ contains a
set $A$ of linearly independent elements such that the linear subspace spanned by $A$
coincides with $X$.

Any such set $A$ is called a Hamel basis for $X$. It is quite a different kind of
object to the usual spanning set or basis used, where $X$ is the closure of the span
of the basis. If the Hamel basis is $A = \{x_\lambda\}_{\lambda \in \Lambda}$, then every element $x$ of $X$ has a
(unique) representation

$$x = \sum_{\lambda \in \Lambda} a_\lambda x_\lambda$$

in which the sum is finite and the $a_\lambda$ are scalars.

Proof. Let $\mathcal{S}$ be the set of subsets of $X$ that comprise linearly independent
elements, and write $\mathcal{S} = \{A, B, C, \dots\}$. Define a partial ordering on $\mathcal{S}$ by $A \le B$ if
and only if $A \subset B$.

We first claim that if $\{A_\alpha\}$ is a totally ordered subset of $\mathcal{S}$, it has the upper
bound $B = \bigcup_\alpha A_\alpha$. In order to prove this, we must show that any finite number
of elements $x_1, \dots, x_n$ of $B$ are linearly independent. Assume that $x_i \in A_{\alpha_i}$ for
$i = 1, \dots, n$. Since the set $\{A_\alpha\}$ is totally ordered, one of the subsets $A_{\alpha_j}$ contains
all the others. It follows that $\{x_1, \dots, x_n\} \subset A_{\alpha_j}$, so $x_1, \dots, x_n$ are linearly
independent.

We may therefore apply Theorem A.4 to conclude that $\mathcal{S}$ has a maximal element
$A$. If $y \in X$ is not a finite linear combination of elements of $A$, then the set
$B = A \cup \{y\}$ belongs to $\mathcal{S}$ (since it is linearly independent), and $A \le B$, but it is
not true that $B \le A$, contradicting the maximality of $A$.

It follows that every element of $X$ is a finite linear combination of elements of
$A$. □
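In finite dimensions no appeal to Zorn's lemma is needed: a maximal linearly independent subset of a finite spanning list can be found by a greedy search, which mirrors the maximality argument in the proof. The sketch below works in $\mathbb{Q}^3$ with exact rational arithmetic; the function names and the list `candidates` are hypothetical illustrations.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over Q, by Gaussian elimination with exact Fraction arithmetic."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def maximal_independent_subset(vectors):
    """Keep each vector that is not a linear combination of those already kept."""
    chosen = []
    for v in vectors:
        if rank(chosen + [v]) == len(chosen) + 1:
            chosen.append(v)
    return chosen

candidates = [(1, 1, 0), (2, 2, 0), (0, 1, 1), (1, 2, 1), (0, 0, 1)]
basis = maximal_independent_subset(candidates)
print(basis)
```

Adding any discarded vector back to `basis` does not raise the rank, which is the finite-dimensional counterpart of maximality. In infinite dimensions no such terminating search exists in general, which is why the Axiom of Choice enters.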



2. Baire category theorem 

Most of the facts assembled here are really about metric spaces - normed spaces 
are a special case of metric spaces. 

A subset $S \subset X$ of a normed space is nowhere dense if for every point $x$ in the
closure of $S$, and for every $\epsilon > 0$, $B_\epsilon(x) \cap (X \setminus \bar{S})$ is non-empty.

The diameter of $S \subset X$ is defined by

$$\operatorname{diam}(S) = \sup_{a, b \in S} \|a - b\|.$$

Theorem A.6. Let $\{F_n\}$ be a decreasing sequence of non-empty closed sets
(this means $F_n \supset F_{n+1}$ for all $n$) in a complete normed space $X$. If the sequence
of diameters $\operatorname{diam}(F_n)$ converges to zero, then there exists exactly one point in the
intersection $\bigcap_{n=1}^{\infty} F_n$.

Proof. If $x$ and $y$ are both in the intersection, then by the definition of the
diameter, $\|x - y\| \le \operatorname{diam}(F_n) \to 0$ so $x = y$. It follows that there can be no more
than one point in the intersection.

Now choose a point $x_n \in F_n$ for each $n$. Since $x_n \in F_n \subset F_m$ for $n > m$, we have
$\|x_n - x_m\| \le \operatorname{diam}(F_m) \to 0$ as $n > m \to \infty$. Thus the sequence $(x_n)$ is Cauchy, so has a limit $x$ say by
completeness. For any $n$, $F_n$ is a closed set that contains all the $x_m$ with $m > n$,
so $x \in F_n$. It follows that $x \in \bigcap_{n=1}^{\infty} F_n$. □

The next result is a version of the Baire¹ category theorem.

Theorem A.7. A complete normed space cannot be written as a countable
union of nowhere dense sets.

In the language of metric spaces, this means that a complete normed space is
of second category.

¹ René Baire (1874-1932) was one of the most influential French mathematicians of the early
20th century. His interest in the general ideas of continuity was reinforced by Volterra. In 1905,
Baire became professor of analysis at the Faculty of Science in Dijon. While there, he wrote an
important treatise on discontinuous functions. Baire's category theorem bears his name today, as
do two other important mathematical concepts, Baire functions and Baire classes.

PROOF. Let $X$ be a complete normed space, and suppose that $X = \bigcup_{j=1}^{\infty} X_j$
where each $X_j$ is nowhere dense (that is, the sets $\bar{X}_j$ all have empty interior). Fix
a ball $B_1(x_0)$. Since $\bar{X}_1$ does not contain $B_1(x_0)$ there must be a point $x_1 \in B_1(x_0)$
with $x_1 \notin \bar{X}_1$. It follows that there is a ball $B_{r_1}(x_1)$ such that $\bar{B}_{r_1}(x_1) \subset B_1(x_0)$
and $\bar{B}_{r_1}(x_1) \cap X_1 = \emptyset$. Assume without loss of generality that $r_1 < \frac{1}{2}$.

Similarly, there is a point $x_2$ and a radius $r_2$ such that $\bar{B}_{r_2}(x_2) \subset B_{r_1}(x_1)$, and
$\bar{B}_{r_2}(x_2) \cap X_2 = \emptyset$, and without loss of generality $r_2 < \frac{1}{3}$. Notice that $\bar{B}_{r_2}(x_2) \cap X_1 = \emptyset$
automatically since $\bar{B}_{r_2}(x_2) \subset B_{r_1}(x_1)$.

Inductively, we construct a sequence of decreasing closed balls $\bar{B}_{r_n}(x_n)$ such
that $\bar{B}_{r_n}(x_n) \cap X_j = \emptyset$ for $1 \le j \le n$, and $r_n \to 0$ as $n \to \infty$.

Now by Theorem A.6 there must be a point $x$ in the intersection of all the
closed balls $\bar{B}_{r_n}(x_n)$, so $x \notin X_j$ for all $j \ge 1$. This implies that $x \notin \bigcup_{j \ge 1} X_j = X$,
a contradiction. □



